Abstract

In the present paper, we prove that the Wasserstein distance on the space of continuous
sample-paths equipped with the supremum norm between the laws of a uniformly elliptic
one-dimensional diffusion process and its Euler discretization with $N$ steps is smaller than
$O(N^{-2/3+\varepsilon})$ where $\varepsilon$ is an arbitrary positive constant. This rate is intermediate
between the strong error estimation in $O(N^{-1/2})$ obtained when coupling the stochastic
differential equation and the Euler scheme with the same Brownian motion and the weak error
estimation $O(N^{-1})$ obtained when comparing the expectations of the same function of the
diffusion and of the Euler scheme at the terminal time $T$. We also check that the supremum over
$t \in [0,T]$ of the Wasserstein distance on the space of probability measures on the real line
between the laws of the diffusion at time $t$ and the Euler scheme at time $t$ behaves like
$O(\sqrt{\log(N)}\,N^{-1})$.
For $\sigma : \mathbb{R} \to \mathbb{R}$ and $b : \mathbb{R} \to \mathbb{R}$, we are interested in the simulation of the
stochastic differential equation

$$dX_t = \sigma(X_t)\,dW_t + b(X_t)\,dt \tag{0.1}$$

where $X_0 = x_0 \in \mathbb{R}$ and $W = (W_t)_{t \ge 0}$ is a standard Brownian motion. We make the standard
Lipschitz assumptions on the coefficients:

$$\exists K \in (0,+\infty),\ \forall x, y \in \mathbb{R},\quad |\sigma(x) - \sigma(y)| + |b(x) - b(y)| \le K|x - y|.$$

For $T > 0$, we are interested in the approximation of $X = (X_t)_{t\in[0,T]}$ by its Euler scheme
$\bar X = (\bar X_t)_{t\in[0,T]}$ with $N \ge 1$ time-steps. We consider the regular grid
$\{0 = t_0 < t_1 < t_2 < \dots < t_N = T\}$ of the interval $[0,T]$ with $t_k = \frac{kT}{N}$ and define
inductively $\bar X_0 = x_0$ and

$$\bar X_t = \bar X_{t_k} + \sigma(\bar X_{t_k})(W_t - W_{t_k}) + b(\bar X_{t_k})(t - t_k) \quad\text{for } t \in [t_k, t_{k+1}]. \tag{0.2}$$
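As an illustration of the recursion (0.2), the following Python sketch simulates the Euler scheme
at the grid times $t_k = kT/N$. The function name `euler_path` and the two coefficient functions
are illustrative assumptions that do not come from the paper; only the standard library is used.

```python
import math
import random

def euler_path(sigma, b, x0, T, N, rng):
    """Simulate the Euler scheme (0.2) at the grid times t_k = k*T/N.

    Returns the values [X_0, X_{t_1}, ..., X_{t_N}] together with the
    Brownian increments that were used to build them.
    """
    h = T / N
    path = [x0]
    increments = []
    for _ in range(N):
        dw = rng.gauss(0.0, math.sqrt(h))  # W_{t_{k+1}} - W_{t_k} ~ N(0, h)
        x = path[-1]
        path.append(x + sigma(x) * dw + b(x) * h)
        increments.append(dw)
    return path, increments

# Illustrative Lipschitz coefficients (assumptions, not from the paper).
sigma = lambda x: 1.0 + 0.5 * math.sin(x)
b = lambda x: -x

path, dws = euler_path(sigma, b, x0=0.0, T=1.0, N=100, rng=random.Random(0))
```

Returning the increments alongside the path makes it easy to drive another scheme with the same
Brownian motion, which is the coupling underlying the strong error.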
It is well known that the order of convergence of the strong error of discretization is $N^{-1/2}$.
Indeed, we have (see [17])

$$\forall p \ge 1,\ \exists C < +\infty,\ \forall N \ge 1,\quad E^{1/p}\Big[\sup_{t \le T} |X_t - \bar X_t|^p\Big] \le \frac{C}{\sqrt{N}}. \tag{0.3}$$

See Section 1 for a more precise statement. This upper-bound gives the correct order of
convergence since according to Remark 3.6 [20], when $\sigma$ and $b$ are continuously differentiable,
$(\sqrt{N}(X_t - \bar X_t))_{t \le T}$ converges in law as $N$ goes to $\infty$ to some diffusion limit which is non zero
as soon as $\sigma$ is positive and non constant (see also [21] and [15] where stable convergence is also
proved). When $\sigma$ is constant, then the Euler scheme coincides with the Milstein scheme and the
strong order of convergence is $N^{-1}$.
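The order $1/2$ of the strong error can be observed numerically on an example where the exact
solution is known. The sketch below is only an illustration under assumptions that are not in the
paper: it takes the linear coefficients $\sigma(x) = s\,x$ and $b(x) = \mu x$ (Lipschitz, with exact
solution $x_0 \exp((\mu - s^2/2)t + sW_t)$), uses $p = 1$, replaces the supremum over $[0,T]$ by a
maximum over the grid times, and uses arbitrary parameter values.

```python
import math
import random

def strong_error_gbm(N, n_paths, rng, x0=1.0, mu=0.05, s=0.4, T=1.0):
    """Monte Carlo estimate of E[max_k |X_{t_k} - Xbar_{t_k}|] when the SDE
    and its Euler scheme are driven by the same Brownian motion."""
    h = T / N
    total = 0.0
    for _ in range(n_paths):
        w, xbar, worst = 0.0, x0, 0.0
        for k in range(1, N + 1):
            dw = rng.gauss(0.0, math.sqrt(h))
            xbar += s * xbar * dw + mu * xbar * h  # one Euler step
            w += dw                                # Brownian motion at time t_k
            exact = x0 * math.exp((mu - 0.5 * s * s) * k * h + s * w)
            worst = max(worst, abs(exact - xbar))
        total += worst
    return total / n_paths

rng = random.Random(1)
err_coarse = strong_error_gbm(N=16, n_paths=400, rng=rng)
err_fine = strong_error_gbm(N=256, n_paths=400, rng=rng)
# If the error scales like N**(-1/2), this ratio should be of the order of 4.
ratio = err_coarse / err_fine
```

With only a few hundred paths the ratio is noisy, so it should be read as an order of magnitude
rather than as a precise estimate of the rate.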

On the other hand, the order of convergence of the weak error of discretization is always $N^{-1}$.
For example, according to [31], when $\sigma$ and $b$ are $C^\infty$ with bounded derivatives of all orders and
$f : \mathbb{R} \to \mathbb{R}$ is $C^\infty$ with polynomial growth together with its derivatives then, for each integer
$L \ge 1$, the expansion

$$E[f(\bar X_T)] - E[f(X_T)] = \sum_{l=1}^{L} \frac{a_l}{N^l} + O\big(N^{-(L+1)}\big) \tag{0.4}$$

in powers of $N^{-1}$ holds for the weak error. The bound $|E[f(\bar X_T)] - E[f(X_T)]| \le \frac{C}{N}$ holds when
$\sigma$, $b$ and $f$ are $C^4$ with the same growth assumptions. When $f$ is only assumed to be measurable
and bounded, it is proved in [2, 3] that the expansion (0.4) still holds for $L = 1$ if $b$ and $\sigma$
are smooth functions satisfying a hypoellipticity condition. Under uniform ellipticity, [13] even
extends this expansion by only assuming that $f$ is a tempered distribution acting on the densities
of both $X_T$ and $\bar X_T$.

In view of financial applications, the weak error analysis gives the convergence rate to 0 of the
discretization bias introduced when replacing $X$ by its Euler scheme $\bar X$ for the computation of
the price $E[f(X_T)]$ of a vanilla European option with payoff $f$ and maturity $T$ written on $X$. Let
$\mathcal{C}$ denote the space $C([0,T],\mathbb{R})$ of continuous paths endowed with the sup norm. When dealing
with exotic options with payoff $F : \mathcal{C} \to \mathbb{R}$ Lipschitz continuous,

$$|E[F(X)] - E[F(\bar X)]| \le E|F(X) - F(\bar X)| \le \frac{C}{\sqrt{N}}$$

where the second inequality follows from the strong error estimate. But the first inequality is
very rough and prevents one from taking advantage of the cancellations in the mean which occur
and make it possible to obtain the upper-bound $\frac{C}{N}$ for vanilla options. The weak error analysis
has been performed for specific path-dependent payoffs, typically when $F(X) = f(X_T, Y_T)$ with $Y_t$ a
function of $(X_s)_{0 \le s \le t}$ such that $((X_t, Y_t))_{0 \le t \le T}$ is a Markov process. The cases
$Y_t = \int_0^t X_s\,ds$ and $Y_t = \max_{0 \le s \le t} X_s$ respectively correspond to Asian [30] and barrier
[9, 10, 11] or lookback options [27]. But no general theory has been developed so far to analyse
the weak trajectorial error. The Wasserstein distance between the laws $\mathcal{L}(X)$ and $\mathcal{L}(\bar X)$ of $X$
and $\bar X$ defined by

$$W_1(\mathcal{L}(X), \mathcal{L}(\bar X)) = \sup_{F : \mathcal{C} \to \mathbb{R}\,:\,\mathrm{Lip}(F) \le 1} |E[F(X)] - E[F(\bar X)]|,$$

where $\mathrm{Lip}(F)$ denotes the Lipschitz constant of $F$, is the appropriate measure to deal with
the whole class of exotic Lipschitz payoffs. Notice that this distance has already been used in
the context of discretization schemes for SDEs: in the multidimensional setting, by a clever
rotation of the driving Brownian motion, Cruzeiro, Malliavin and Thalmaier [4] construct a
modified Milstein scheme which does not involve the simulation of iterated Brownian integrals
and with order of convergence $N^{-1}$ for the Wasserstein distance. A simpler scheme with the
same convergence properties is exhibited in [16] for usual stochastic volatility models.

The weak and strong error estimations recalled above imply that

$$\exists c, C < +\infty,\ \forall N \ge 1,\quad \frac{c}{N} \le W_1(\mathcal{L}(X), \mathcal{L}(\bar X)) \le \frac{C}{\sqrt{N}}. \tag{0.5}$$

A very nice feature of the Wasserstein distance is its primal representation in the Kantorovich
duality theory. This representation is obtained by choosing $p = 1$, $E = \mathcal{C}$ and
$(\mu, \nu) = (\mathcal{L}(X), \mathcal{L}(\bar X))$ in the general definition

$$W_p(\mu, \nu) = \Big(\inf_{\pi \in \Pi(\mu,\nu)} \int_{E \times E} |x - y|^p\, \pi(dx, dy)\Big)^{1/p} \tag{0.6}$$

where $p \in [1, +\infty)$, $(E, |\cdot|)$ is a normed vector space, $\mu$ and $\nu$ are two probability measures
on $E$ endowed with its Borel sigma-field and the infimum is computed on the set $\Pi(\mu, \nu)$ of
probability measures on $E \times E$ with respective marginals $\mu$ and $\nu$ (see for instance Remark 6.5
p95 [29]).

When one is able to exhibit some coupling $(Y, \bar Y)$ with $Y \overset{\mathcal{L}}{=} X$ and $\bar Y \overset{\mathcal{L}}{=} \bar X$, then the law of
$(Y, \bar Y)$ belongs to $\Pi(\mathcal{L}(X), \mathcal{L}(\bar X))$ and necessarily
$W_p(\mathcal{L}(X), \mathcal{L}(\bar X)) \le E^{1/p}\big[\sup_{t\in[0,T]} |Y_t - \bar Y_t|^p\big]$. For
the obvious coupling $(Y, \bar Y) = (X, \bar X)$ obtained by choosing the same driving Brownian motion
for the diffusion and its Euler scheme, one recovers the upper-bound in (0.5) from the strong
error analysis. The main result of the present paper is the construction of a better coupling
which leads to the upper-bound

$$\forall p \ge 1,\ \forall \varepsilon > 0,\ \exists C < +\infty,\ \forall N \ge 1,\quad W_p(\mathcal{L}(X), \mathcal{L}(\bar X)) \le \frac{C}{N^{2/3-\varepsilon}}$$

proved in Section 3 under additional regularity assumptions on the coefficients and uniform
ellipticity. To construct this coupling, we first obtain in Section 2 a time-uniform estimation of
the Wasserstein distance between the respective laws $\mathcal{L}(X_t)$ and $\mathcal{L}(\bar X_t)$ of $X_t$ and $\bar X_t$:

$$\forall p \ge 1,\ \exists C < +\infty,\ \forall N \ge 1,\quad \sup_{t\in[0,T]} W_p(\mathcal{L}(X_t), \mathcal{L}(\bar X_t)) \le \frac{C\sqrt{\log(N)}}{N}.$$

Before that, in Section 1, we recall well-known results concerning the moments and the dependence
on the initial condition of the solution to the SDE (0.1) and its Euler scheme. We also make explicit
the dependence of the strong error estimations $E[\sup_{s \le t} |X_s - \bar X_s|]$ with respect to $t \in [0,T]$,
which will play a key role in our analysis.



1 Basic estimates on the SDE and its Euler scheme 

We recall some well-known results concerning the flow defined by (0.1) (see e.g. Karatzas and 
Shreve [18], p 306) and its Euler approximation. 

Proposition 1.1 Let us denote by $(X^x_t)_{t\in[0,T]}$ the solution of (0.1), starting from $x \in \mathbb{R}$. For
any $p \ge 1$, there exists a positive constant $C \equiv C(p, T)$ such that:

$$\forall x \in \mathbb{R},\quad E\Big[\sup_{t\in[0,T]} |X^x_t|^p\Big] \le C(1 + |x|)^p \tag{1.1}$$

$$\forall x \in \mathbb{R},\ \forall s \le t \le T,\quad E\Big[\sup_{u\in[s,t]} |X^x_u - X^x_s|^p\Big] \le C(1 + |x|)^p (t - s)^{p/2} \tag{1.2}$$

$$\forall x, y \in \mathbb{R},\quad E\Big[\sup_{t\in[0,T]} |X^x_t - X^y_t|^p\Big] \le C|y - x|^p \tag{1.3}$$



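Proposition 1.2 Let $(\bar X^x_t)_{t\in[0,T]}$ denote the Euler scheme (0.2) starting from $x$. For any
$p \in [1, \infty)$, there exists a positive constant $C \equiv C(p, T)$ such that

$$\forall N \ge 1,\ \forall x \in \mathbb{R},\quad E\Big[\sup_{t\in[0,T]} |\bar X^x_t|^p\Big] \le C(1 + |x|)^p \tag{1.4}$$

$$\forall N \ge 1,\ \forall x \in \mathbb{R},\ \forall t \in [0,T],\quad E\Big[\sup_{r\in[0,t]} |\bar X^x_r - X^x_r|^p\Big] \le \frac{C t^{p/2}(1 + |x|)^p}{N^{p/2}} \tag{1.5}$$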



The moment bound (1.4) for the Euler scheme holds in fact as soon as the drift and the diffusion
coefficients have a sublinear growth. The strong convergence order is established in
Kanagawa [17] for Lipschitz and bounded coefficients. In fact, it is straightforward to extend
Kanagawa's proof to merely Lipschitz coefficients by using the estimates (1.1) and (1.4) and
obtain

$$\forall N \ge 1,\ \forall x \in \mathbb{R},\ \forall t \in [0,T],\quad E\Big[\sup_{r\in[0,t]} |\bar X^x_r - X^x_r|^p\Big] \le \frac{C(1 + |x|)^p}{N^{p/2}}. \tag{1.6}$$



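The estimate (1.5) makes the dependence on $t$ precise. This slight improvement will in fact play a
crucial role in constructing the coupling between the diffusion and the Euler scheme. We prove it
for the sake of completeness even though the arguments are really standard.
Proof of (1.5). Let $\tau_s = \sup\{t_i, t_i \le s\}$ denote the last discretization time before $s$. We have
$\bar X^x_t - X^x_t = \int_0^t b(\bar X^x_{\tau_s}) - b(X^x_s)\,ds + \int_0^t \sigma(\bar X^x_{\tau_s}) - \sigma(X^x_s)\,dW_s$. By Jensen's and
Burkholder-Davis-Gundy inequalities,

$$E\Big[\sup_{r\in[0,t]} |\bar X^x_r - X^x_r|^p\Big] \le 2^p E\Big[\Big(\int_0^t |b(\bar X^x_{\tau_s}) - b(X^x_s)|\,ds\Big)^p\Big] + C_p E\Big[\Big(\int_0^t |\sigma(\bar X^x_{\tau_s}) - \sigma(X^x_s)|^2\,ds\Big)^{p/2}\Big]$$

$$\le 2^p t^{p-1} \int_0^t E\big[|b(\bar X^x_{\tau_s}) - b(X^x_s)|^p\big]\,ds + C_p t^{p/2-1} \int_0^t E\big[|\sigma(\bar X^x_{\tau_s}) - \sigma(X^x_s)|^p\big]\,ds.$$

Denoting by $\mathrm{Lip}(\sigma)$ the finite Lipschitz constant of $\sigma$, we have
$|\sigma(\bar X^x_{\tau_s}) - \sigma(X^x_s)| \le \mathrm{Lip}(\sigma)(|\bar X^x_{\tau_s} - X^x_{\tau_s}| + |X^x_{\tau_s} - X^x_s|)$. Thus, (1.2) and (1.6) yield
$E[|\sigma(\bar X^x_{\tau_s}) - \sigma(X^x_s)|^p] \le \frac{C(1 + |x|)^p}{N^{p/2}}$, and the same bound holds for $b$ replacing $\sigma$.
Since $t^p \le T^{p/2} t^{p/2}$, we easily conclude. $\blacksquare$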



2 The Wasserstein distance between the marginal laws 

In this section, we are interested in finding an upper bound for the Wasserstein distance between
the marginal laws of the SDE (0.1) and its Euler scheme. It is well known that the optimal
coupling between two one-dimensional random variables is obtained by the inverse transform
sampling. Thus, let $F_t$ and $\bar F_t$ denote the respective cumulative distribution functions of $X_t$ and
$\bar X_t$. The $p$-Wasserstein distance between the time-marginals of the solution to the SDE and its
Euler scheme is given by (see Theorem 3.1.2 in [24]):

$$W_p(\mathcal{L}(X_t), \mathcal{L}(\bar X_t)) = \Big(\int_0^1 |F_t^{-1}(u) - \bar F_t^{-1}(u)|^p\,du\Big)^{1/p}. \tag{2.1}$$
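For two empirical measures with the same number of atoms, the inverse transform coupling simply
pairs the order statistics of the two samples. The following Python sketch computes the
corresponding discrete analogue of the one-dimensional quantile formula for $W_p$; the function
name `wasserstein_1d` and the sample values are illustrative assumptions.

```python
def wasserstein_1d(xs, ys, p=1.0):
    """W_p between two empirical measures with the same number of atoms.

    In dimension one the optimal coupling pairs the k-th smallest point of
    one sample with the k-th smallest point of the other, which amounts to
    comparing the two empirical quantile functions.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes samples of equal size")
    xs, ys = sorted(xs), sorted(ys)
    cost = sum(abs(x - y) ** p for x, y in zip(xs, ys)) / len(xs)
    return cost ** (1.0 / p)

sample = [0.3, -1.2, 2.5, 0.0]
shifted = [x + 0.5 for x in reversed(sample)]  # same sample translated by 0.5
d1 = wasserstein_1d(sample, shifted, p=1.0)
d2 = wasserstein_1d(sample, shifted, p=2.0)
```

Translating a sample by a constant moves every quantile by that constant, so both distances equal
the size of the shift whatever the order in which the points are listed.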

Let us state now the main result of this Section. We set:

$$C_b^k = \{f : \mathbb{R} \to \mathbb{R} \text{ $k$ times continuously differentiable s.t. } \|f^{(i)}\|_\infty < \infty,\ 0 \le i \le k\}.$$

Hypothesis 2.1 Let $a = \sigma^2$. We assume that

$\exists \underline{a} > 0,\ \forall x \in \mathbb{R},\ a(x) \ge \underline{a}$ (uniform ellipticity),

$a \in C_b^2$ and $a''$ is globally $\gamma$-Hölder continuous with $\gamma > 0$,

$b \in C_b^2$.

Since $\sigma$ is Lipschitz continuous, under Hypothesis 2.1, we have either $\sigma = \sqrt{a}$ or $\sigma = -\sqrt{a}$.
From now on, we assume without loss of generality that $\sigma = \sqrt{a}$ which is a $C_b^2$ function bounded
from below by the positive constant $\underline{\sigma} = \sqrt{\underline{a}}$.

Theorem 2.2 Under Hypothesis 2.1, we have for any $p \ge 1$,

$$\forall N \ge 1,\quad \sup_{t\in[0,T]} W_p(\mathcal{L}(X_t), \mathcal{L}(\bar X_t)) \le \frac{C\sqrt{\log(N)}}{N}$$

where $C$ is a positive constant that only depends on $p$, $T$, $\underline{a}$ and $(\|a^{(i)}\|_\infty, \|b^{(i)}\|_\infty,\ 0 \le i \le 2)$
and does not depend on the initial condition $x \in \mathbb{R}$.

Remark 2.3 When $p = 1$, the slightly better bound $\sup_{t\in[0,T]} W_1(\mathcal{L}(X_t), \mathcal{L}(\bar X_t)) \le \frac{C}{N}$ holds if $\sigma$
is uniformly elliptic, according to [28] chapter 3. This is proved in a multidimensional setting
for $C^\infty$ coefficients $\sigma$ and $b$ with bounded derivatives by extending the results of [13] but can also
be derived from a result of Gobet and Labart [12] only supposing that $b, \sigma \in C_b^3$. Let $p_t(x,y)$ and
$\bar p_t(x,y)$ denote respectively the density of $X^x_t$ and $\bar X^x_t$. Then, Theorem 2.3 in [12] gives, for
some constant $c > 0$ and some finite $K(T)$:

$$\forall (t,x,y) \in (0,T] \times \mathbb{R}^2,\quad |p_t(x,y) - \bar p_t(x,y)| \le \frac{T K(T)}{N t} \exp\Big(-\frac{c|x - y|^2}{t}\Big).$$

As remarked in [28] chapter 3, for $f : \mathbb{R} \to \mathbb{R}$ a Lipschitz continuous function with Lipschitz
constant not greater than one, one deduces that

$$|E[f(\bar X^x_t)] - E[f(X^x_t)]| = \Big|\int_{\mathbb{R}} (f(y) - f(x))(\bar p_t(x,y) - p_t(x,y))\,dy\Big| \le \frac{T K(T)}{N t} \int_{\mathbb{R}} |y - x| \exp\Big(-\frac{c|x - y|^2}{t}\Big)dy = \frac{K(T)\,T}{cN}$$

which gives $\sup_{t \le T} W_1(\mathcal{L}(X_t), \mathcal{L}(\bar X_t)) \le \frac{C}{N}$ by the dual formulation of the 1-Wasserstein
distance.
Our approach consists in controlling the time evolution of the Wasserstein distance. To do so,
we need to compute the evolution of both $F_t^{-1}(u)$ and $\bar F_t^{-1}(u)$. In the next two propositions,
we derive partial differential equations satisfied by these functions by integrating in space the
Fokker-Planck equations and then applying the implicit function theorem.

Proposition 2.4 Assume that Hypothesis 2.1 holds. Then for any $t \in (0,T]$, the cumulative
distribution function $x \mapsto F_t(x)$ is invertible with inverse denoted by $F_t^{-1}(u)$. Moreover, the
function $(t,u) \mapsto F_t^{-1}(u)$ is $C^{1,2}$ on $(0,T] \times (0,1)$ and satisfies

$$\partial_t F_t^{-1}(u) = -\frac{1}{2}\,\partial_u\Big(\frac{a(F_t^{-1}(u))}{\partial_u F_t^{-1}(u)}\Big) + b(F_t^{-1}(u)). \tag{2.2}$$
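The quantile equation $\partial_t F_t^{-1} = -\frac{1}{2}\partial_u\big(a(F_t^{-1})/\partial_u F_t^{-1}\big) + b(F_t^{-1})$ can be
sanity-checked in the simplest case of constant coefficients, where $X_t = x_0 + \sigma W_t + bt$ and
$F_t^{-1}(u) = x_0 + bt + \sigma\sqrt{t}\,\Phi^{-1}(u)$ with $\Phi$ the standard Gaussian distribution function. The
Python sketch below evaluates both sides by centred finite differences. The constants, the step
`eps` and the function names are illustrative assumptions, and the check says nothing about
non-constant coefficients.

```python
from statistics import NormalDist

STD = NormalDist()                  # standard Gaussian law
SIGMA, DRIFT, X0 = 0.8, 0.3, 0.0    # illustrative constants (assumptions)

def quantile(t, u):
    """F_t^{-1}(u) when X_t = X0 + SIGMA * W_t + DRIFT * t."""
    return X0 + DRIFT * t + SIGMA * t ** 0.5 * STD.inv_cdf(u)

def flux(t, u, eps):
    """a(F_t^{-1}(u)) / d_u F_t^{-1}(u), with a centred difference in u."""
    du_q = (quantile(t, u + eps) - quantile(t, u - eps)) / (2 * eps)
    return SIGMA ** 2 / du_q

def residual(t, u, eps=1e-3):
    """Left-hand side minus right-hand side of the quantile equation."""
    dt_q = (quantile(t + eps, u) - quantile(t - eps, u)) / (2 * eps)
    du_flux = (flux(t, u + eps, eps) - flux(t, u - eps, eps)) / (2 * eps)
    return dt_q - (-0.5 * du_flux + DRIFT)

worst = max(abs(residual(t, u)) for t in (0.25, 1.0) for u in (0.1, 0.5, 0.9))
```

The residual is not exactly zero because of the finite-difference error, but it should be small
compared with the individual terms, which are of order one at these points.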

Proposition 2.5 Assume that $\sigma$ and $b$ have linear growth: $\exists C > 0,\ \forall x \in \mathbb{R},\ |\sigma(x)| + |b(x)| \le C(1 + |x|)$
and that uniform ellipticity holds: $\exists \underline{a} > 0,\ \forall x \in \mathbb{R},\ a(x) \ge \underline{a}$. Then for any
$t \in (0,T]$, $\bar X_t$ admits a density $\bar p_t(x)$ with respect to the Lebesgue measure and its cumulative
distribution function $x \mapsto \bar F_t(x)$ is invertible with inverse denoted by $\bar F_t^{-1}(u)$. Moreover, for each
$k \in \{0, \dots, N-1\}$, the function $(t,u) \mapsto \bar F_t^{-1}(u)$ is $C^{1,2}$ on $(t_k, t_{k+1}] \times (0,1)$ and, on this set, it
is a classical solution of

$$\partial_t \bar F_t^{-1}(u) = -\frac{1}{2}\,\partial_u\Big(\frac{\alpha_t(u)}{\partial_u \bar F_t^{-1}(u)}\Big) + \beta_t(u) \tag{2.3}$$

where $\alpha_t(u) = E[a(\bar X_{t_k}) \mid \bar X_t = \bar F_t^{-1}(u)]$ and $\beta_t(u) = E[b(\bar X_{t_k}) \mid \bar X_t = \bar F_t^{-1}(u)]$.

The proofs of these two propositions are postponed to Appendix A. Let us mention here that
Proposition 2.4 also holds when $b'$ is only Hölder continuous: the Lipschitz assumption on $b'$ is
needed later to prove Theorem 2.2. The PDEs (2.2) and (2.3) enable us to compute the time
derivative of the $p$-th power of the Wasserstein distance (2.1) and to prove, again in Appendix A,
the following key lemma.

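Lemma 2.6 Under Hypothesis 2.1, for $p \ge 2$, the function $t \mapsto W_p^p(\mathcal{L}(X_t), \mathcal{L}(\bar X_t))$ is continuous
on $[0,T]$ and its first order distribution derivative $\partial_t W_p^p(\mathcal{L}(X_t), \mathcal{L}(\bar X_t))$ is an integrable function
on $[0,T]$. Moreover, $dt$ a.e.,

$$\partial_t W_p^p(\mathcal{L}(X_t), \mathcal{L}(\bar X_t)) \le C\Big(W_p^p(\mathcal{L}(X_t), \mathcal{L}(\bar X_t)) + \int_0^1 |F_t^{-1}(u) - \bar F_t^{-1}(u)|^{p-1}\, |b(\bar F_t^{-1}(u)) - \beta_t(u)|\,du$$
$$\qquad + \int_0^1 |F_t^{-1}(u) - \bar F_t^{-1}(u)|^{p-2}\, \big(a(\bar F_t^{-1}(u)) - \alpha_t(u)\big)^2\,du\Big), \tag{2.4}$$

where $C$ is a positive constant that only depends on $p$, $\underline{a}$, $\|a'\|_\infty$ and $\|b'\|_\infty$.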

The last ingredient of the proof of Theorem 2.2 is the next Lemma, the proof of which is also
postponed to Appendix A.

Lemma 2.7 Let $\tau_t = \sup\{t_i, t_i \le t\}$ denote the last discretization time before $t$. Under
Hypothesis 2.1, we have for all $p \ge 1$:

$$\exists C < +\infty,\ \forall N \ge 1,\ \forall t \in [0,T],\quad E\big[\,|E[W_t - W_{\tau_t} \mid \bar X_t]|^p\,\big] \le C\Big(\frac{1}{N} \wedge \frac{1}{N^2 t}\Big)^{p/2}.$$
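Proof of Theorem 2.2. Since $W_p(\mathcal{L}(X_t), \mathcal{L}(\bar X_t)) \le W_{p'}(\mathcal{L}(X_t), \mathcal{L}(\bar X_t))$ for $p \le p'$, it is enough
to prove the estimation for $p \ge 2$. Therefore we suppose without loss of generality that $p \ge 2$.
Let $\psi_p(t) = W_p^2(\mathcal{L}(X_t), \mathcal{L}(\bar X_t))$ and, for any integer $k \ge 1$, $h_k(x) = k^{-2/p} h(kx)$ where

$$h(x) = \begin{cases} x^{2/p} & \text{if } x \ge 1, \\ 1 + \frac{2}{p}(x - 1) & \text{otherwise.} \end{cases}$$

Since $h_k$ is $C^1$ and non-decreasing, Lemma 2.6 and Hölder's inequality imply that

$$h_k\big(\psi_p^{p/2}(t)\big) = h_k\big(W_p^p(\mathcal{L}(X_0), \mathcal{L}(\bar X_0))\big) + \int_0^t h_k'\big(\psi_p^{p/2}(s)\big)\, \partial_s W_p^p(\mathcal{L}(X_s), \mathcal{L}(\bar X_s))\,ds$$
$$\le h_k(0) + C \int_0^t h_k'\big(\psi_p^{p/2}(s)\big) \Big[\psi_p^{p/2}(s) + \psi_p^{(p-1)/2}(s) \Big(\int_0^1 |b(\bar F_s^{-1}(u)) - \beta_s(u)|^p du\Big)^{1/p}$$
$$\qquad + \psi_p^{(p-2)/2}(s) \Big(\int_0^1 |a(\bar F_s^{-1}(u)) - \alpha_s(u)|^p du\Big)^{2/p}\Big]\,ds.$$

Since for fixed $x > 0$, the sequence $(h_k'(x))_k$ is non-decreasing and converges to $\frac{2}{p} x^{\frac{2}{p} - 1}$ as
$k \to \infty$, one may take the limit in this inequality thanks to the monotone convergence theorem
and remark that the image of the Lebesgue measure on $[0,1]$ by $\bar F_s^{-1}$ is the distribution of $\bar X_s$
to deduce

$$\psi_p(t) \le \frac{2C}{p} \int_0^t \psi_p(s) + \psi_p^{1/2}(s)\, E^{1/p}\big(|b(\bar X_s) - E(b(\bar X_{\tau_s}) \mid \bar X_s)|^p\big) + E^{2/p}\big(|a(\bar X_s) - E(a(\bar X_{\tau_s}) \mid \bar X_s)|^p\big)\,ds. \tag{2.5}$$

One has

$$a(\bar X_{\tau_s}) - a(\bar X_s) = a'(\bar X_s)\sigma(\bar X_s)(W_{\tau_s} - W_s) - a'(\bar X_s)\big[(\sigma(\bar X_{\tau_s}) - \sigma(\bar X_s))(W_s - W_{\tau_s}) + b(\bar X_{\tau_s})(s - \tau_s)\big]$$
$$\qquad + (\bar X_{\tau_s} - \bar X_s) \int_0^1 a'(v\bar X_{\tau_s} + (1 - v)\bar X_s) - a'(\bar X_s)\,dv.$$

Using Jensen's inequality, the boundedness assumptions on $a$, $b$ and their derivatives and Lemma
2.7, one gets

$$E\big(|a(\bar X_s) - E(a(\bar X_{\tau_s}) \mid \bar X_s)|^p\big) \le C E\big(|\sigma a'(\bar X_s)|^p\, |E(W_s - W_{\tau_s} \mid \bar X_s)|^p\big)$$
$$\qquad + C E\big((s - \tau_s)^p + |(\sigma(\bar X_{\tau_s}) - \sigma(\bar X_s))(W_s - W_{\tau_s})|^p + |\bar X_{\tau_s} - \bar X_s|^{2p}\big)$$
$$\le C\Big(\frac{1}{N^{p/2}} \wedge \frac{1}{N^p s^{p/2}}\Big).$$

The same bound holds with $a$ replaced by $b$. With (2.5) and Young's inequality, one deduces

$$\psi_p(t) \le C \int_0^t \Big[\psi_p(s) + \frac{1}{N} \wedge \frac{1}{N^2 s}\Big]ds \le C \int_0^t \psi_p(s)\,ds + \frac{C(1 + \log(N))}{N^2}.$$

One concludes by Gronwall's lemma. $\blacksquare$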

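Remark 2.8 When $\sigma(x) = \sigma$ is constant, the term $E^{2/p}\big(|a(\bar X_s) - E(a(\bar X_{\tau_s}) \mid \bar X_s)|^p\big)$ in (2.5)
vanishes and the above reasoning ensures that $\bar\psi_p(t)$ defined as $\sup_{s\in[0,t]} \psi_p(s)$ satisfies

$$\bar\psi_p(t) \le C \int_0^t \bar\psi_p(s)\,ds + C \bar\psi_p^{1/2}(t) \int_0^t \frac{1}{\sqrt{N}} \wedge \frac{1}{N\sqrt{s}}\,ds \le C \int_0^t \bar\psi_p(s)\,ds + \frac{1}{2}\bar\psi_p(t) + \frac{C}{N^2}.$$

By Gronwall's lemma, we recover the estimation $\sup_{t\in[0,T]} W_p(\mathcal{L}(X_t), \mathcal{L}(\bar X_t)) \le \frac{C}{N}$ which is also a
consequence of the strong order of convergence of the Euler scheme when the diffusion coefficient
is constant.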
3 The Wasserstein distance between the pathwise laws



We now state the main result of the paper. 

Hypothesis 3.1 We assume that a G C£, b G C$, and 

$\exists \underline{a} > 0,\ \forall x \in \mathbb{R},\ a(x) \ge \underline{a}$ (uniform ellipticity).

Clearly, Hypothesis 3.1 implies Hypothesis 2.1. 
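Theorem 3.2 Under Hypothesis 3.1, we have:

$$\forall p \ge 1,\ \forall \varepsilon > 0,\ \exists C < +\infty,\ \forall N \ge 1,\quad W_p(\mathcal{L}(X), \mathcal{L}(\bar X)) \le \frac{C}{N^{2/3-\varepsilon}}.$$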



Before proving the theorem, let us state some of its consequences for the pricing of lookback
options. It is well-known that if $(U_k)_{0 \le k \le N-1}$ are independent random variables uniformly
distributed on $[0,1]$ and independent from the Brownian increments $(W_{t_{k+1}} - W_{t_k})_{0 \le k \le N-1}$
then

$$\bar X^d = \max_{0 \le k \le N-1} \frac{1}{2}\Big(\bar X_{t_k} + \bar X_{t_{k+1}} + \sqrt{(\bar X_{t_{k+1}} - \bar X_{t_k})^2 - 2\sigma^2(\bar X_{t_k})\, t_1 \ln(U_k)}\Big)$$

is such that $(\bar X_0, \bar X_{t_1}, \dots, \bar X_T, \bar X^d) \overset{\mathcal{L}}{=} (\bar X_0, \bar X_{t_1}, \dots, \bar X_T, \max_{t\in[0,T]} \bar X_t)$.
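The simulation of the maximum of the continuous-time Euler scheme described above can be
sketched in Python as follows: on each time step, one extra uniform variable is turned into a
sample of the maximum of the Brownian bridge joining two consecutive grid values. The function
name `euler_with_running_max` and the coefficients are illustrative assumptions.

```python
import math
import random

def euler_with_running_max(sigma, b, x0, T, N, rng):
    """Return the terminal value of the Euler scheme and a sample of the
    maximum of its continuous-time interpolation over [0, T]."""
    h = T / N
    x, running_max = x0, x0
    for _ in range(N):
        x_next = x + sigma(x) * rng.gauss(0.0, math.sqrt(h)) + b(x) * h
        u = 1.0 - rng.random()  # uniform on (0, 1], so that log(u) is finite
        # Maximum over one step of the bridge from x to x_next, with the
        # volatility frozen at sigma(x) and a step of length h = t_1.
        bridge_max = 0.5 * (x + x_next + math.sqrt(
            (x_next - x) ** 2 - 2.0 * sigma(x) ** 2 * h * math.log(u)))
        running_max = max(running_max, bridge_max)
        x = x_next
    return x, running_max

sigma = lambda x: 1.0 + 0.5 * math.sin(x)  # illustrative, uniformly elliptic
b = lambda x: -x
x_T, max_T = euler_with_running_max(sigma, b, 0.0, 1.0, 50, random.Random(2))
```

Since $\ln(U_k) \le 0$, each simulated bridge maximum is at least as large as both endpoints of the
step, so the sampled maximum always dominates the values of the scheme on the grid.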



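Corollary 3.3 If $f : \mathbb{R}^2 \to \mathbb{R}$ is Lipschitz continuous, then, under Hypothesis 3.1,

$$\forall \varepsilon > 0,\ \exists C < +\infty,\ \forall N \ge 1,\quad \Big|E\Big[f\Big(X_T, \max_{t\in[0,T]} X_t\Big)\Big] - E\big[f(\bar X_T, \bar X^d)\big]\Big| \le \frac{C}{N^{2/3-\varepsilon}}. \tag{3.1}$$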



To our knowledge, this result appears to be new. Of course, when $f$ is also differentiable with
respect to its second variable, one has

$$E\Big[f\Big(X_T, \max_{t\in[0,T]} X_t\Big)\Big] = E[f(X_T, x_0)] + \int_{x_0}^{+\infty} E\Big[\partial_2 f(X_T, x)\, 1_{\{\max_{t\in[0,T]} X_t \ge x\}}\Big]\,dx.$$



One could contemplate combining the weak error analysis for the first term in the right-hand-side
with Theorem 2.3 [10] devoted to barrier options to obtain the order $N^{-1}$ instead of
$N^{-2/3+\varepsilon}$ in (3.1). In Theorem 2.3 [10], Gobet assumes regularity and uniform ellipticity of
the coefficients $\sigma$ and $b$ and it is not clear whether the estimation is preserved by integration
over $[x_0, +\infty)$. More importantly a structure condition on the payoff function implying that
$\partial_2 f(x, x) = 0$ for all $x \ge x_0$ is needed.



Proof of Theorem 3.2. We first deduce from Theorem 2.2 some bound on the Wasserstein
distance between the finite dimensional marginals of the diffusion $X$ and its Euler scheme $\bar X$ on
a coarse time-grid. For $m \in \{1, \dots, N-1\}$, we set $n = \lceil N/m \rceil$ and define

$$s_l = \frac{lmT}{N} \text{ for } l \in \{0, \dots, n-1\}, \text{ and } s_n = T.$$

Combining the next proposition, the proof of which is postponed to Appendix B, with Theorem
2.2, one obtains that

$$W_p\big(\mathcal{L}(X_{s_1}, \dots, X_{s_n}), \mathcal{L}(\bar X_{s_1}, \dots, \bar X_{s_n})\big) \le \frac{C\sqrt{\log(N)}}{m} \tag{3.2}$$

where the constant $C$ does not depend on $(m, N)$.
Proposition 3.4 Let $\mathbb{R}^n$ be endowed with the norm $|(x_1, \dots, x_n)| = \max_{1 \le i \le n} |x_i|$. For any
$p \ge 1$, there is a constant $C$ not depending on $n$ such that

$$W_p\big(\mathcal{L}(X_{s_1}, \dots, X_{s_n}), \mathcal{L}(\bar X_{s_1}, \dots, \bar X_{s_n})\big) \le C n \sup_{0 \le t \le T,\, x \in \mathbb{R}} W_p\big(\mathcal{L}(X^x_t), \mathcal{L}(\bar X^x_t)\big).$$

There is a probability measure $\pi(dx_1, \dots, dx_n, d\bar x_1, \dots, d\bar x_n)$ in
$\Pi(\mathcal{L}(X_{s_1}, \dots, X_{s_n}), \mathcal{L}(\bar X_{s_1}, \dots, \bar X_{s_n}))$ which attains the Wasserstein distance in the
left-hand-side of (3.2) (see for instance Theorem 3.3.11 [24]). Let
$\tilde\pi(x_1, \dots, x_n, d\bar x_1, \dots, d\bar x_n)$ denote a regular conditional probability of
$(\bar x_1, \dots, \bar x_n)$ given $(x_1, \dots, x_n)$ when $\mathbb{R}^{2n}$ is endowed with $\pi$ and $(Y_{s_1}, \dots, Y_{s_n})$ be distributed
according to $\tilde\pi(X_{s_1}, \dots, X_{s_n}, d\bar x_1, \dots, d\bar x_n)$. The vector $(X_{s_1}, \dots, X_{s_n}, Y_{s_1}, \dots, Y_{s_n})$ is
distributed according to $\pi$ so that

$$(Y_{s_1}, \dots, Y_{s_n}) \overset{\mathcal{L}}{=} (\bar X_{s_1}, \dots, \bar X_{s_n}) \quad\text{and}\quad E^{1/p}\Big[\max_{1 \le l \le n} |X_{s_l} - Y_{s_l}|^p\Big] \le \frac{C\sqrt{\log(N)}}{m}. \tag{3.3}$$


Let $p_t(x,y)$ denote the transition density of the SDE (0.1) and $\ell_t(x,y) = \log(p_t(x,y))$. According
to Appendix C devoted to diffusion bridges, the processes

$$\Big(W^l_t = \int_{s_l}^t \big(dW_s - \sigma(X_s)\,\partial_x \ell_{s_{l+1}-s}(X_s, X_{s_{l+1}})\,ds\big),\ t \in [s_l, s_{l+1})\Big)_{0 \le l \le n-1}$$

are independent Brownian motions independent from $(X_{s_1}, \dots, X_{s_n})$. We suppose from now on
that the vector $(Y_{s_1}, \dots, Y_{s_n})$ has been generated independently from these processes and so will
be all the random variables and processes needed in the remainder of the proof (see in particular
the construction of $\beta$ below). Moreover

$$\begin{cases} Z^{x,y}_t = x + \int_{s_l}^t \sigma(Z^{x,y}_s)\,dW^l_s + \int_{s_l}^t \big[b(Z^{x,y}_s) + \sigma^2(Z^{x,y}_s)\,\partial_x \ell_{s_{l+1}-s}(Z^{x,y}_s, y)\big]\,ds, & t \in [s_l, s_{l+1}) \\ Z^{x,y}_{s_{l+1}} = y \end{cases} \tag{3.4}$$

is distributed according to the conditional law of $(X_t)_{t\in[s_l,s_{l+1}]}$ given $(X_{s_l}, X_{s_{l+1}}) = (x,y)$ and
for each $l \in \{0, \dots, n-1\}$, one has $(Z^{X_{s_l}, X_{s_{l+1}}}_t)_{t\in[s_l,s_{l+1}]} = (X_t)_{t\in[s_l,s_{l+1}]}$. In order to construct
a good coupling between $\mathcal{L}(X)$ and $\mathcal{L}(\bar X)$, a natural idea would be to extend $(Y_{s_1}, \dots, Y_{s_n})$ to
a process $(Y_t)_{t\in[0,T]}$ with law $\mathcal{L}(\bar X)$ by defining for each $l \in \{0, \dots, n-1\}$, $(Y_t)_{t\in[s_l,s_{l+1}]}$ as an
Euler scheme bridge driven by $W^l$ and starting from $Y_{s_l}$ and ending at $Y_{s_{l+1}}$. Unfortunately,
even if the Euler scheme bridge is deduced by a simple transformation of the Brownian bridge
on a single time-step, it becomes a complicated process when the difference between the starting
and ending times is larger than $\frac{T}{N}$ because of the lack of Markov property. We are finally going
to choose the difference $s_{l+1} - s_l$ of order $\frac{1}{N^{1/3}}$ and therefore much larger than the time-step
$\frac{T}{N}$. In addition, it is not clear how to compare the paths of the diffusion bridge and the Euler
scheme bridge driven by the same Brownian motion. That is why we are going to introduce
some new process $(\chi_t)_{t\in[0,T]}$ such that the comparison will be performed at the diffusion bridge
level, which is not so easy yet.

To construct $\chi$, we are going to exhibit a Brownian motion $(\beta_t)_{t\in[0,T]}$ such that $(Y_{s_1}, \dots, Y_{s_n})$
are the values on the coarse time-grid of the Euler scheme with time-step $\frac{T}{N}$ driven by $\beta$. The
extension $(Y_t)_{t\in[0,T]}$ with law $\mathcal{L}(\bar X)$ is then simply defined as the whole Euler scheme driven by $\beta$:

$$Y_t = Y_{t_k} + \sigma(Y_{t_k})(\beta_t - \beta_{t_k}) + b(Y_{t_k})(t - t_k), \quad t \in [t_k, t_{k+1}],\ 0 \le k \le N-1.$$

The construction of $\beta$ is postponed to the end of the present proof. One then defines

$$\chi_t = Y_{s_l} + \int_{s_l}^t \sigma(\chi_s)\,d\beta_s + \int_{s_l}^t b(\chi_s)\,ds, \quad t \in [s_l, s_{l+1}),\ 0 \le l \le n-1.$$
Notice that the process $\chi = (\chi_t)_{t\in[0,T]}$ which evolves according to the SDE (0.1) with $\beta$ replacing
$W$ on each time-interval $[s_l, s_{l+1})$ is càdlàg: discontinuities may arise at the points
$\{s_{l+1},\ 0 \le l \le n-1\}$. We denote by $\chi_{s_{l+1}-}$ its left-hand limit at time $s_{l+1}$ and set $\chi_T = \chi_{s_n-}$. The strong
error estimation (1.5) will make it possible to estimate the difference between the processes $Y$ and $\chi$. Of
course, there is no hope for the processes $\chi$ and $X$ to be close. Nevertheless, the process $\tilde\chi$
obtained by setting

$$\forall l \in \{0, \dots, n-1\},\ \forall t \in [s_l, s_{l+1}),\quad \tilde\chi_t = Z^{\chi_{s_l}, \chi_{s_{l+1}-}}_t \quad\text{and}\quad \tilde\chi_T = \chi_T$$

where $Z^{x,y}$ is defined in (3.4) is such that $\mathcal{L}(\tilde\chi) = \mathcal{L}(\chi)$ by Propositions C.1 and C.3. On each
coarse time-interval $[s_l, s_{l+1})$, the diffusion bridges associated with $X$ and $\tilde\chi$ are driven by the
same Brownian motion $W^l$. Moreover the differences $|X_{s_l} - Y_{s_l}|$ between the starting points and
$|X_{s_{l+1}} - \chi_{s_{l+1}-}| \le |X_{s_{l+1}} - Y_{s_{l+1}}| + |Y_{s_{l+1}} - \chi_{s_{l+1}-}|$ between the ending points are controlled by
(3.3) and the above mentioned strong error estimation. That is why one may expect to obtain
a good estimation of the difference between the processes $X$ and $\tilde\chi$. By the triangle inequality
and since $\mathcal{L}(\bar X) = \mathcal{L}(Y)$ and $\mathcal{L}(\chi) = \mathcal{L}(\tilde\chi)$,

$$W_p(\mathcal{L}(X), \mathcal{L}(\bar X)) \le W_p(\mathcal{L}(\bar X), \mathcal{L}(\chi)) + W_p(\mathcal{L}(\chi), \mathcal{L}(X)) \le E^{1/p}\Big[\sup_{t\in[0,T]} |Y_t - \chi_t|^p\Big] + E^{1/p}\Big[\sup_{t\in[0,T]} |X_t - \tilde\chi_t|^p\Big] \tag{3.5}$$



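where, for the definition of $W_p(\mathcal{L}(\bar X), \mathcal{L}(\chi))$ and $W_p(\mathcal{L}(\chi), \mathcal{L}(X))$, the space of càdlàg
sample-paths from $[0,T]$ to $\mathbb{R}$ is endowed with the supremum norm. Let us first estimate the first term
in the right-hand-side. Let $q > 1$. From (1.5), we get

$$E\Big[\sup_{t\in[s_l,s_{l+1})} |Y_t - \chi_t|^{pq}\ \Big|\ Y_{s_l}\Big] \le C\,\frac{m^{pq/2}}{N^{pq}}\,(1 + |Y_{s_l}|)^{pq}$$

where the constant $C$ does not depend on $(N, m)$. We deduce that

$$E\Big[\sup_{t\in[0,T]} |Y_t - \chi_t|^{pq}\Big] = E\Big[\max_{0 \le l \le n-1}\ \sup_{t\in[s_l,s_{l+1})} |Y_t - \chi_t|^{pq}\Big] \le \sum_{l=0}^{n-1} E\Big[E\Big[\sup_{t\in[s_l,s_{l+1})} |Y_t - \chi_t|^{pq}\ \Big|\ Y_{s_l}\Big]\Big]$$
$$\le C\,\frac{m^{pq/2}}{N^{pq}} \sum_{l=0}^{n-1} E\big[(1 + |Y_{s_l}|)^{pq}\big] \le C\,\frac{m^{pq/2-1}}{N^{pq-1}},$$

where we used (1.4) for the last inequality. As a consequence,

$$E^{1/p}\Big[\sup_{t\in[0,T]} |Y_t - \chi_t|^p\Big] \le E^{1/pq}\Big[\sup_{t\in[0,T]} |Y_t - \chi_t|^{pq}\Big] \le C\,\frac{m^{\frac{1}{2} - \frac{1}{pq}}}{N^{1 - \frac{1}{pq}}}. \tag{3.6}$$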



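Let us now estimate the second term in the right-hand-side of (3.5). By Proposition C.3 and
since for $l \in \{0, \dots, n-1\}$, $\chi_{s_l} = Y_{s_l}$,

$$\sup_{t \le T} |X_t - \tilde\chi_t| = \max_{0 \le l \le n-1}\ \sup_{t\in[s_l,s_{l+1})} \Big|Z^{X_{s_l}, X_{s_{l+1}}}_t - Z^{Y_{s_l}, \chi_{s_{l+1}-}}_t\Big| \le C \max_{0 \le l \le n-1} |X_{s_l} - Y_{s_l}| \vee |X_{s_{l+1}} - \chi_{s_{l+1}-}|.$$

Since, by the triangle inequality and the continuity of $Y$,

$$|X_{s_{l+1}} - \chi_{s_{l+1}-}| \le |X_{s_{l+1}} - Y_{s_{l+1}}| + |Y_{s_{l+1}} - \chi_{s_{l+1}-}| \le |X_{s_{l+1}} - Y_{s_{l+1}}| + \sup_{t\in[0,T]} |Y_t - \chi_t|,$$

one deduces that

$$\sup_{t \le T} |X_t - \tilde\chi_t| \le C\Big(\max_{0 \le l \le n} |X_{s_l} - Y_{s_l}| + \sup_{t\in[0,T]} |Y_t - \chi_t|\Big).$$

Combined with (3.3) and (3.6), this implies

$$E^{1/p}\Big[\sup_{t \le T} |X_t - \tilde\chi_t|^p\Big] \le C E^{1/p}\Big[\max_{1 \le l \le n} |X_{s_l} - Y_{s_l}|^p\Big] + C E^{1/p}\Big[\sup_{t\in[0,T]} |Y_t - \chi_t|^p\Big] \le C\Big(\frac{\sqrt{\log(N)}}{m} + \frac{m^{\frac{1}{2} - \frac{1}{pq}}}{N^{1 - \frac{1}{pq}}}\Big).$$

Plugging this inequality together with (3.6) in (3.5), we deduce that

$$W_p(\mathcal{L}(X), \mathcal{L}(\bar X)) \le C\Big(\frac{\sqrt{\log(N)}}{m} + \frac{m^{\frac{1}{2} - \frac{1}{pq}}}{N^{1 - \frac{1}{pq}}}\Big)$$

and conclude by choosing $m = \lfloor N^{2/3} \rfloor$ and $q \ge \frac{1}{3p\varepsilon}$.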
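The choice of $m$ can be understood by balancing exponents. Reading the final bound as the sum of
a term of order $1/m$ (up to the logarithm) and a term of order $m^{1/2 - 1/(pq)} / N^{1 - 1/(pq)}$, and
writing $m = N^{\alpha}$, the Python sketch below computes the resulting exponent of $N$ and searches
for the best $\alpha$ on a grid. The function name `rate_exponent` and the grid are illustrative
assumptions.

```python
def rate_exponent(alpha, pq):
    """Exponent e such that the bound behaves like N**e, up to logarithms,
    when m = N**alpha: the larger of the exponents of the two terms
    1/m and m**(1/2 - 1/pq) / N**(1 - 1/pq)."""
    first = -alpha
    second = alpha * (0.5 - 1.0 / pq) - (1.0 - 1.0 / pq)
    return max(first, second)

# Grid search over alpha in (0, 1) for a very large value of p*q.
alphas = [k / 300 for k in range(1, 300)]
best_alpha = min(alphas, key=lambda al: rate_exponent(al, pq=1e9))

# Exponent obtained with alpha = 2/3 and a moderate value p*q = 10.
exponent_pq10 = rate_exponent(2.0 / 3.0, pq=10.0)
```

With $\alpha = 2/3$ the second exponent equals $-2/3 + 1/(3pq)$, which explains why the rate
$N^{-2/3+\varepsilon}$ is reached by taking $q$ large enough.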

To end the proof, we still have to construct the Brownian motion $\beta$. We first reconstruct on the
fine time grid $(t_k)_{0 \le k \le N}$ an Euler scheme $(Y_{t_k},\ 0 \le k \le N)$ interpolating the values on the coarse
grid $(s_l)_{1 \le l \le n}$. Let us denote by $\bar p(x,y)$ the density of the law $\mathcal{N}(x + b(x)T/N,\ \sigma(x)^2 T/N)$ of the
Euler scheme starting from $x$ after one time step $T/N$. Thanks to the ellipticity assumption,
we have $\bar p(x,y) > 0$ for any $x, y \in \mathbb{R}$. Conditionally on $(Y_{s_1}, \dots, Y_{s_n})$, we generate independent
random vectors

$$(Y_{s_{l-1}+t_1}, \dots, Y_{s_{l-1}+t_{m-1}})_{1 \le l \le n-1} \quad\text{and}\quad (Y_{s_{n-1}+t_1}, \dots, Y_{t_{N-1}})$$

with respective densities

$$\frac{\bar p(Y_{s_{l-1}}, x_1)\,\bar p(x_1, x_2) \cdots \bar p(x_{m-1}, Y_{s_l})}{\int_{\mathbb{R}^{m-1}} \bar p(Y_{s_{l-1}}, y_1)\,\bar p(y_1, y_2) \cdots \bar p(y_{m-1}, Y_{s_l})\,dy_1 \dots dy_{m-1}}$$

and

$$\frac{\bar p(Y_{s_{n-1}}, x_1)\,\bar p(x_1, x_2) \cdots \bar p(x_{N-1-m(n-1)}, Y_{s_n})}{\int_{\mathbb{R}^{N-1-m(n-1)}} \bar p(Y_{s_{n-1}}, y_1)\,\bar p(y_1, y_2) \cdots \bar p(y_{N-1-m(n-1)}, Y_{s_n})\,dy_1 \dots dy_{N-1-m(n-1)}}$$

and get immediately $(Y_{t_k})_{0 \le k \le N} \overset{\mathcal{L}}{=} (\bar X_{t_k})_{0 \le k \le N}$. Then, thanks to the ellipticity condition,
$\Big(\frac{1}{\sigma(Y_{t_{k-1}})}\big(Y_{t_k} - Y_{t_{k-1}} - b(Y_{t_{k-1}})\frac{T}{N}\big)\Big)_{1 \le k \le N}$ are independent centered Gaussian variables with
variance $T/N$. By using independent Brownian bridges, we can then construct a Brownian motion
$(\beta_t)_{t\in[0,T]}$ such that

$$\beta_{t_k} - \beta_{t_{k-1}} = \frac{1}{\sigma(Y_{t_{k-1}})}\Big(Y_{t_k} - Y_{t_{k-1}} - b(Y_{t_{k-1}})\frac{T}{N}\Big),$$

which ends the construction.
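The last step, recovering the increments of the driving Brownian motion from the values of an
Euler scheme on the fine grid, amounts to inverting the Euler recursion, which is possible because
the diffusion coefficient does not vanish. The Python sketch below illustrates this inversion on
a simulated path; the function names and the coefficients are illustrative assumptions.

```python
import math
import random

def euler_from_increments(x0, increments, sigma, b, h):
    """Euler scheme on the grid driven by the given Brownian increments."""
    path = [x0]
    for dw in increments:
        y = path[-1]
        path.append(y + sigma(y) * dw + b(y) * h)
    return path

def recover_increments(path, sigma, b, h):
    """Invert the Euler recursion: return the increments that reproduce
    the path. Requires sigma to be non-zero along the path."""
    return [(y1 - y0 - b(y0) * h) / sigma(y0) for y0, y1 in zip(path, path[1:])]

sigma = lambda x: 1.0 + 0.5 * math.sin(x)  # illustrative, bounded away from 0
b = lambda x: -x
rng = random.Random(3)
h = 1.0 / 64
true_dw = [rng.gauss(0.0, math.sqrt(h)) for _ in range(64)]
path = euler_from_increments(0.5, true_dw, sigma, b, h)
recovered = recover_increments(path, sigma, b, h)
max_gap = max(abs(u - v) for u, v in zip(true_dw, recovered))
```

In the proof the path is given first and the increments are deduced from it, whereas here the
path is built from known increments so that the inversion can be checked.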



Conclusion 



In this paper, we prove that the order of convergence of the Wasserstein distance $W_p$ on the
space of continuous paths between the laws of a uniformly elliptic one-dimensional diffusion and
its Euler scheme with $N$ steps is not worse than $N^{-2/3+\varepsilon}$. In view of a possible extension to
multidimensional settings, two main difficulties have to be overcome. First, we took advantage
of the optimality of the inverse transform coupling in dimension one to obtain a uniform bound on
the Wasserstein distance between the marginal laws with optimal rate $N^{-1}$ up to a logarithmic
factor. In dimension $d > 1$, the optimal coupling between two probability measures on $\mathbb{R}^d$ is not
available, which makes the estimation of the Wasserstein distance between the marginal laws
much more complicated even if, for $W_1$, the order $N^{-1}$ may be deduced from the results of [12]
(see Remark 2.3). In the second place, one has to generalize the estimation on diffusion bridges
given by Proposition C.3 which we deduce from the Lamperti transform in dimension $d = 1$.
In the perspective of the multi-level Monte Carlo method introduced by Giles [8], coupling
with order of convergence $N^{-2/3+\varepsilon}$ the Euler schemes with $N$ and $2N$ steps would also be
of great interest for variance reduction, especially in multidimensional situations where the
Milstein scheme is not feasible (see [16] for the implementation of this idea in the example of
a discretization scheme devoted to usual stochastic volatility models). But this does not seem
obvious from our non-constructive coupling between the Euler scheme and its diffusion limit.
For both the derivation of the order of convergence of the Wasserstein distance on the path space
and the task of making the coupling explicit, the limiting step in our approach is Proposition 3.4.
In this proposition, we bound the dual formulation of the Wasserstein distance between
$n$-dimensional marginals by the Wasserstein distance between one-dimensional marginals multiplied by $n$.
Even if the order of convergence of the Wasserstein distance on the path space obtained in the
present paper may not be optimal, it provides the first significant step from the order $N^{-1/2}$
obtained with the trivial coupling where the diffusion and the Euler scheme are driven by the
same Brownian motion.



A Proofs of Section 2 



Proof of Proposition 2.4. According to [6], Theorems 5.4 and 4.7, for any $t \in (0,T]$, $X_t$
admits a density $p_t(x)$ w.r.t. the Lebesgue measure on the real line, the function $(t,x) \mapsto p_t(x)$
is $C^{1,2}$ on $(0,T] \times \mathbb{R}$ and, on this set, it is a classical solution of the Fokker-Planck equation

$$\partial_t p_t(x) = \frac{1}{2}\partial_{xx}(a(x)p_t(x)) - \partial_x(b(x)p_t(x)). \tag{A.1}$$

Moreover, the following Gaussian bounds hold

$$\exists C > 0,\ \forall t \in (0,T],\ \forall x \in \mathbb{R},\quad |p_t(x)| + \sqrt{t}\,|\partial_x p_t(x)| \le \frac{C}{\sqrt{t}}\, e^{-\frac{(x - x_0)^2}{Ct}}. \tag{A.2}$$

The partial derivatives $\partial_x F_t(x) = p_t(x)$ and $\partial_{xx} F_t(x) = \partial_x p_t(x)$ exist and are continuous on
$(0,T] \times \mathbb{R}$. For $0 < s < t \le T$ and $y \le x$, integrating (A.1) over $[s,t] \times [y,x]$, then letting
$y \to -\infty$ thanks to (A.2), one obtains $F_t(x) - F_s(x) = \int_s^t \frac{1}{2}\partial_x(a(x)p_r(x)) - b(x)p_r(x)\,dr$. By
continuity of the integrand w.r.t. $(r,x)$ one deduces that the partial derivative $\partial_t F_t(x)$ exists
and is continuous on $(0,T] \times \mathbb{R}$. So, $(t,x) \mapsto F_t(x)$ is $C^{1,2}$ on $(0,T] \times \mathbb{R}$ and solves

$$\partial_t F_t(x) = \frac{1}{2}\partial_x(a(x)\partial_x F_t(x)) - b(x)\partial_x F_t(x). \tag{A.3}$$

According to [1], the density is also bounded from below by some Gaussian kernel:
$\exists c > 0,\ \forall (t,x) \in (0,T] \times \mathbb{R},\ p_t(x) \ge \frac{c}{\sqrt{t}}\, e^{-\frac{(x - x_0)^2}{ct}}$. This enables us to apply the implicit function
theorem to $(t,x,u) \mapsto F_t(x) - u$ to deduce that the inverse $u \mapsto F_t^{-1}(u)$ of $x \mapsto F_t(x)$ is $C^{1,2}$ in
the variables $(t,u) \in (0,T] \times (0,1)$ and solves

$$\partial_t F_t^{-1}(u) = -\frac{\partial_t F_t(F_t^{-1}(u))}{\partial_x F_t(F_t^{-1}(u))} = -\frac{1}{2}\partial_x\big(a(x)\partial_x F_t(x)\big)\Big|_{x = F_t^{-1}(u)}\, \partial_u F_t^{-1}(u) + b(F_t^{-1}(u))$$
$$= -\frac{1}{2}\partial_u\Big(\frac{a(F_t^{-1}(u))}{\partial_u F_t^{-1}(u)}\Big) + b(F_t^{-1}(u)),$$

where we used (A.3) for the second equality and $\partial_u F_t^{-1}(u) = \frac{1}{\partial_x F_t(F_t^{-1}(u))}$ for both the second
and the third equalities. $\blacksquare$
Proof of Proposition 2.5. For $t \in (0,t_1]$, $\bar X_t$ admits the Gaussian density with mean $x_0 + b(x_0)t$ and variance $a(x_0)t$. By induction on $k$ and independence of $W_t - W_{t_k}$ and $\bar X_{t_k}$ in (0.2), one checks that for $k \in \{1,\dots,N-1\}$, $\bar X_{t_k}$ admits a density $\bar p_{t_k}(x)$ and that for $t \in (t_k,t_{k+1}]$, $(\bar X_{t_k},\bar X_t)$ admits the density

$$\rho(t_k,t,y,x) = \bar p_{t_k}(y)\,\frac{\exp\left(-\frac{(x-y-b(y)(t-t_k))^2}{2a(y)(t-t_k)}\right)}{\sqrt{2\pi a(y)(t-t_k)}}.$$

The marginal density $\bar p_t(x) = \int_{\mathbb{R}} \bar p_{t_k}(y)\,\frac{\exp\left(-\frac{(x-y-b(y)(t-t_k))^2}{2a(y)(t-t_k)}\right)}{\sqrt{2\pi a(y)(t-t_k)}}\,dy$ of $\bar X_t$ is continuous on $(t_k,t_{k+1}] \times \mathbb{R}$ by Lebesgue's theorem and positive.
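
As a purely illustrative sketch (not part of the proof), the one-step Gaussian transition used above can be written in a few lines of Python. The function names `euler_path` and `gaussian_transition_density` and the coefficient callables are hypothetical; the sketch assumes that `sigma` is the diffusion coefficient and $a = \sigma^2$, so that the conditional variance over $[t_k,t]$ is $a(y)(t-t_k)$.

```python
import math
import random


def euler_path(x0, sigma, b, T, n_steps, rng):
    """Simulate the Euler scheme on the regular grid t_k = k*T/n_steps.

    Returns the list of values at the grid times t_0, ..., t_N.
    """
    h = T / n_steps
    path = [x0]
    for _ in range(n_steps):
        x = path[-1]
        dw = rng.gauss(0.0, math.sqrt(h))  # Brownian increment over one step
        path.append(x + sigma(x) * dw + b(x) * h)
    return path


def gaussian_transition_density(x, y, t_minus_tk, sigma, b):
    """Density at x of the scheme at time t, given the value y at time t_k.

    Assumption: a = sigma**2, so mean = y + b(y)(t - t_k), var = a(y)(t - t_k).
    """
    var = sigma(y) ** 2 * t_minus_tk
    mean = y + b(y) * t_minus_tk
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)
```

Integrating `gaussian_transition_density` against the density of the scheme at time $t_k$ gives the marginal density at time $t$, which is the formula displayed above.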

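Let $\mathcal{N}(x) = \int_{-\infty}^x e^{-\frac{y^2}{2}}\frac{dy}{\sqrt{2\pi}}$ denote the cumulative distribution function of the standard Gaussian law and $k \in \{0,\dots,N-1\}$. Again by the independence structure in (0.2), for $(t,x) \in (t_k,t_{k+1}] \times \mathbb{R}$, $\bar F_t(x) = \mathbb{E}\left(\mathcal{N}\left(\frac{x - \bar X_{t_k} - b(\bar X_{t_k})(t-t_k)}{\sqrt{a(\bar X_{t_k})(t-t_k)}}\right)\right)$. One has

$$\partial_t\, \mathcal{N}\left(\frac{x-y-b(y)(t-t_k)}{\sqrt{a(y)(t-t_k)}}\right) = -\left[\frac{x-y-b(y)(t-t_k)}{2\sqrt{2\pi a(y)(t-t_k)^3}} + \frac{b(y)}{\sqrt{2\pi a(y)(t-t_k)}}\right] e^{-\frac{(x-y-b(y)(t-t_k))^2}{2a(y)(t-t_k)}}.$$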

By the growth assumption on $a$ and $b$, one easily checks that $\forall k \in \{0,\dots,N\}$, $\mathbb{E}(\bar X_{t_k}^2) < +\infty$. With the uniform ellipticity assumption and Lebesgue's theorem, one deduces that $\bar F_t(x)$ is differentiable w.r.t. $t$ with partial derivative

$$\partial_t \bar F_t(x) = -\mathbb{E}\left[\left(\frac{x-\bar X_{t_k}-b(\bar X_{t_k})(t-t_k)}{2\sqrt{2\pi a(\bar X_{t_k})(t-t_k)^3}} + \frac{b(\bar X_{t_k})}{\sqrt{2\pi a(\bar X_{t_k})(t-t_k)}}\right) e^{-\frac{(x-\bar X_{t_k}-b(\bar X_{t_k})(t-t_k))^2}{2a(\bar X_{t_k})(t-t_k)}}\right] \tag{A.4}$$

continuous in $(t,x) \in (t_k,t_{k+1}] \times \mathbb{R}$. In the same way, one checks smoothness of $\bar F_t(x)$ in the spatial variable $x$ and obtains that this function is $C^{1,2}$ on $(t_k,t_{k+1}] \times \mathbb{R}$.
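When $k \ge 1$,

$$\mathbb{E}\left[b(\bar X_{t_k})\,\frac{e^{-\frac{(x-\bar X_{t_k}-b(\bar X_{t_k})(t-t_k))^2}{2a(\bar X_{t_k})(t-t_k)}}}{\sqrt{2\pi a(\bar X_{t_k})(t-t_k)}}\right] = \int_{\mathbb{R}} b(y)\rho(t_k,t,y,x)\,dy = \mathbb{E}[b(\bar X_{t_k})\,|\,\bar X_t = x]\,\bar p_t(x).$$

For $k = 0$, even if $(\bar X_0,\bar X_t)$ has no density, the equality between the opposite sides of this equation remains true.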
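Combining Lebesgue's theorem and a similar reasoning, one checks that

$$-\mathbb{E}\left[\frac{x-\bar X_{t_k}-b(\bar X_{t_k})(t-t_k)}{\sqrt{2\pi a(\bar X_{t_k})(t-t_k)^3}}\, e^{-\frac{(x-\bar X_{t_k}-b(\bar X_{t_k})(t-t_k))^2}{2a(\bar X_{t_k})(t-t_k)}}\right] = \partial_x\, \mathbb{E}\left[a(\bar X_{t_k})\,\frac{e^{-\frac{(x-\bar X_{t_k}-b(\bar X_{t_k})(t-t_k))^2}{2a(\bar X_{t_k})(t-t_k)}}}{\sqrt{2\pi a(\bar X_{t_k})(t-t_k)}}\right] = \partial_x\big[\mathbb{E}(a(\bar X_{t_k})\,|\,\bar X_t = x)\,\bar p_t(x)\big].$$

With (A.4), one deduces that

$$\partial_t \bar F_t(x) = \frac{1}{2}\partial_x\big(\mathbb{E}[a(\bar X_{t_k})\,|\,\bar X_t = x]\,\partial_x \bar F_t(x)\big) - \mathbb{E}[b(\bar X_{t_k})\,|\,\bar X_t = x]\,\partial_x \bar F_t(x). \tag{A.5}$$

One checks that the function $(t,u) \mapsto \bar F_t^{-1}(u)$ is smooth and satisfies the partial differential equation (2.3) by arguments similar to the ones given at the end of the proof of Proposition 2.4.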



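Remark A.1 In the same way, for $k \in \{0,\dots,N-1\}$, one could prove that on $(t_k,t_{k+1}] \times \mathbb{R}$, $(t,x) \mapsto \bar p_t(x)$ is $C^{1,2}$ and satisfies the partial differential equation

$$\partial_t \bar p_t(x) = \frac{1}{2}\partial_{xx}\big(\mathbb{E}[a(\bar X_{t_k})\,|\,\bar X_t = x]\,\bar p_t(x)\big) - \partial_x\big(\mathbb{E}[b(\bar X_{t_k})\,|\,\bar X_t = x]\,\bar p_t(x)\big)$$

obtained by spatial derivation of (A.5). This is related to [14].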



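Proof of Lemma 2.6. By the continuity of the paths of $X$ and $\bar X$ and the finiteness of $\mathbb{E}\left[\sup_{t \le T}\big(|X_t|^{p+1} + |\bar X_t|^{p+1}\big)\right]$, one easily checks that $t \mapsto W_p^p(\mathcal{L}(X_t),\mathcal{L}(\bar X_t))$ is continuous. Let $k \in \{0,\dots,N-1\}$ and $s,t \in (t_k,t_{k+1}]$ with $s < t$. Combining Propositions 2.4 and 2.5 with a spatial integration by parts, one obtains for $\varepsilon \in (0,1/2)$

$$\begin{aligned}
\int_\varepsilon^{1-\varepsilon} |F_t^{-1}(u)-\bar F_t^{-1}(u)|^p\,du ={}& \int_\varepsilon^{1-\varepsilon} |F_s^{-1}(u)-\bar F_s^{-1}(u)|^p\,du \\
&+ p\int_s^t\!\!\int_\varepsilon^{1-\varepsilon} |F_r^{-1}(u)-\bar F_r^{-1}(u)|^{p-2}\big(F_r^{-1}(u)-\bar F_r^{-1}(u)\big)\big(b(F_r^{-1}(u))-\beta_r(u)\big)\,du\,dr \\
&+ \frac{p(p-1)}{2}\int_s^t\!\!\int_\varepsilon^{1-\varepsilon} |F_r^{-1}(u)-\bar F_r^{-1}(u)|^{p-2}\big(\partial_u F_r^{-1}(u)-\partial_u \bar F_r^{-1}(u)\big)\left(\frac{a(F_r^{-1}(u))}{\partial_u F_r^{-1}(u)}-\frac{\alpha_r(u)}{\partial_u \bar F_r^{-1}(u)}\right)du\,dr \\
&+ \frac{p}{2}\int_s^t |F_r^{-1}(1-\varepsilon)-\bar F_r^{-1}(1-\varepsilon)|^{p-2}\big(F_r^{-1}(1-\varepsilon)-\bar F_r^{-1}(1-\varepsilon)\big)\left(\frac{\alpha_r(1-\varepsilon)}{\partial_u \bar F_r^{-1}(1-\varepsilon)}-\frac{a(F_r^{-1}(1-\varepsilon))}{\partial_u F_r^{-1}(1-\varepsilon)}\right)dr \\
&- \frac{p}{2}\int_s^t |F_r^{-1}(\varepsilon)-\bar F_r^{-1}(\varepsilon)|^{p-2}\big(F_r^{-1}(\varepsilon)-\bar F_r^{-1}(\varepsilon)\big)\left(\frac{\alpha_r(\varepsilon)}{\partial_u \bar F_r^{-1}(\varepsilon)}-\frac{a(F_r^{-1}(\varepsilon))}{\partial_u F_r^{-1}(\varepsilon)}\right)dr.
\end{aligned} \tag{A.6}$$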



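We are now going to take the limit as $\varepsilon \to 0$. We will check at the end of the proof that

$$\lim_{u \to 0^+ \text{ or } 1^-}\ \left[\sup_{r \in [s,t]} \frac{a(F_r^{-1}(u))}{\partial_u F_r^{-1}(u)}\,|F_r^{-1}(u)-\bar F_r^{-1}(u)|^{p-1} + \sup_{r \in [s,t]} \frac{\alpha_r(u)}{\partial_u \bar F_r^{-1}(u)}\,|F_r^{-1}(u)-\bar F_r^{-1}(u)|^{p-1}\right] = 0, \tag{A.7}$$

which enables us to get rid of the two last boundary terms.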
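
The integrals of $|F^{-1}-\bar F^{-1}|^p$ appearing in this proof rest on the one-dimensional identity $W_p^p(\mu,\nu) = \int_0^1 |F_\mu^{-1}(u)-F_\nu^{-1}(u)|^p\,du$. As a hedged numerical illustration only, for two empirical measures with the same number of atoms the quantile functions are step functions and the integral reduces to pairing the sorted samples. The helper name `empirical_wasserstein_p` is hypothetical.

```python
import random


def empirical_wasserstein_p(xs, ys, p):
    """W_p distance between two 1D empirical measures with equally many atoms.

    Uses the quantile (sorted-sample) coupling, which is optimal in dimension one.
    """
    if len(xs) != len(ys):
        raise ValueError("both samples must have the same size")
    xs_sorted, ys_sorted = sorted(xs), sorted(ys)
    cost = sum(abs(x - y) ** p for x, y in zip(xs_sorted, ys_sorted)) / len(xs)
    return cost ** (1.0 / p)
```

For instance, translating a sample by a constant $\delta$ yields a distance equal to $|\delta|$ for every $p \ge 1$, whatever the order in which the samples are listed.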
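Combining Young's inequality with the uniform ellipticity assumption and the positivity of $\partial_u F_r^{-1}(u)$ and $\partial_u \bar F_r^{-1}(u)$, one obtains

$$\begin{aligned}
&\big(\partial_u F_r^{-1}(u)-\partial_u \bar F_r^{-1}(u)\big)\left(\frac{a(F_r^{-1}(u))}{\partial_u F_r^{-1}(u)}-\frac{\alpha_r(u)}{\partial_u \bar F_r^{-1}(u)}\right) \\
&\quad= \big(a(F_r^{-1}(u))-\alpha_r(u)\big)\frac{\partial_u F_r^{-1}(u)-\partial_u \bar F_r^{-1}(u)}{\partial_u F_r^{-1}(u) \vee \partial_u \bar F_r^{-1}(u)} - a(F_r^{-1}(u))\frac{\big((\partial_u \bar F_r^{-1}(u)-\partial_u F_r^{-1}(u))^+\big)^2}{\partial_u F_r^{-1}(u)\,\partial_u \bar F_r^{-1}(u)} - \alpha_r(u)\frac{\big((\partial_u F_r^{-1}(u)-\partial_u \bar F_r^{-1}(u))^+\big)^2}{\partial_u F_r^{-1}(u)\,\partial_u \bar F_r^{-1}(u)} \\
&\quad\le \frac{1}{4\underline{a}}\big(a(F_r^{-1}(u))-\alpha_r(u)\big)^2 + \underline{a}\,\frac{\big(\partial_u F_r^{-1}(u)-\partial_u \bar F_r^{-1}(u)\big)^2}{\big(\partial_u F_r^{-1}(u) \vee \partial_u \bar F_r^{-1}(u)\big)^2} - \underline{a}\,\frac{\big(\partial_u F_r^{-1}(u)-\partial_u \bar F_r^{-1}(u)\big)^2}{\partial_u F_r^{-1}(u)\,\partial_u \bar F_r^{-1}(u)} \\
&\quad\le \frac{1}{4\underline{a}}\big(a(F_r^{-1}(u))-\alpha_r(u)\big)^2.
\end{aligned}$$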
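
The final bound can be sanity-checked numerically. In the sketch below, which is only an illustration under stated assumptions, `x` and `y` stand for the two positive quantile derivatives, `A` and `B` for the two diffusion terms (both assumed to lie above the ellipticity constant `a_low`), and all function names are hypothetical.

```python
import random


def cross_term(x, y, A, B):
    """(x - y) * (A/x - B/y): the product bounded in the display above."""
    return (x - y) * (A / x - B / y)


def young_bound(A, B, a_low):
    """Right-hand side (A - B)^2 / (4 * a_low) of the inequality."""
    return (A - B) ** 2 / (4.0 * a_low)


def worst_gap(n_trials, a_low, a_high, seed=0):
    """Largest observed value of cross_term - young_bound over random draws.

    A non-positive result (up to rounding) is consistent with the inequality.
    """
    rng = random.Random(seed)
    worst = float("-inf")
    for _ in range(n_trials):
        x = rng.uniform(0.01, 10.0)
        y = rng.uniform(0.01, 10.0)
        A = rng.uniform(a_low, a_high)
        B = rng.uniform(a_low, a_high)
        worst = max(worst, cross_term(x, y, A, B) - young_bound(A, B, a_low))
    return worst
```

Such a random search is of course no substitute for the proof; it only guards against sign or pairing mistakes when transcribing the chain of inequalities.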

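Hence, up to the factor $\frac{p(p-1)}{2}$, the third term of the right-hand-side of (A.6) is equal to

$$\begin{aligned}
&\int_s^t\!\!\int_\varepsilon^{1-\varepsilon} |F_r^{-1}(u)-\bar F_r^{-1}(u)|^{p-2}\left[\big(\partial_u F_r^{-1}(u)-\partial_u \bar F_r^{-1}(u)\big)\left(\frac{a(F_r^{-1}(u))}{\partial_u F_r^{-1}(u)}-\frac{\alpha_r(u)}{\partial_u \bar F_r^{-1}(u)}\right) - \frac{\big(a(F_r^{-1}(u))-\alpha_r(u)\big)^2}{4\underline{a}}\right]du\,dr \\
&\quad+ \frac{1}{4\underline{a}}\int_s^t\!\!\int_\varepsilon^{1-\varepsilon} |F_r^{-1}(u)-\bar F_r^{-1}(u)|^{p-2}\big(a(F_r^{-1}(u))-\alpha_r(u)\big)^2\,du\,dr,
\end{aligned}$$

where the integrand in the first integral is non positive. Since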

f f \F-\u) - F~\u)r 2 {\F-\u) - F-\u)\\b{F-\u)) - (3 r (u)\ + (a(F-\ U )) - a r {u)f) dudr 

Js JO 

<2\\b\\ 0O f Wl~\C{X r ),C(X r ))dr + ^at, fwf\C(X r ),C(X r ))dr<+oo, 

J s J s 

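one can take the limit $\varepsilon \to 0$ in (A.6) using Lebesgue's theorem for the second term of the right-hand-side and combining Lebesgue's theorem with monotone convergence for the third term to obtain

$$\begin{aligned}
W_p^p(\mathcal{L}(X_t),\mathcal{L}(\bar X_t)) ={}& W_p^p(\mathcal{L}(X_s),\mathcal{L}(\bar X_s)) \\
&+ p\int_s^t\!\!\int_0^1 |F_r^{-1}(u)-\bar F_r^{-1}(u)|^{p-2}\big(F_r^{-1}(u)-\bar F_r^{-1}(u)\big)\big(b(F_r^{-1}(u))-\beta_r(u)\big)\,du\,dr \\
&+ \frac{p(p-1)}{2}\int_s^t\!\!\int_0^1 |F_r^{-1}(u)-\bar F_r^{-1}(u)|^{p-2}\big(\partial_u F_r^{-1}(u)-\partial_u \bar F_r^{-1}(u)\big)\left(\frac{a(F_r^{-1}(u))}{\partial_u F_r^{-1}(u)}-\frac{\alpha_r(u)}{\partial_u \bar F_r^{-1}(u)}\right)du\,dr.
\end{aligned} \tag{A.8}$$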

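The last term, which belongs to $[-\infty,+\infty)$, is finite since so are all the other terms. We deduce integrability of

$$(r,u) \mapsto |F_r^{-1}(u)-\bar F_r^{-1}(u)|^{p-2}\big(\partial_u F_r^{-1}(u)-\partial_u \bar F_r^{-1}(u)\big)\left(\frac{a(F_r^{-1}(u))}{\partial_u F_r^{-1}(u)}-\frac{\alpha_r(u)}{\partial_u \bar F_r^{-1}(u)}\right)$$

on $[s,t] \times (0,1)$. Similar arguments show that the integrability property and (A.8) remain true for $s = t_k$. By summation, they remain true for $0 \le s \le t \le T$. So integrability holds on $[0,T]$ for the distribution derivative

$$\begin{aligned}
\partial_t W_p^p(\mathcal{L}(X_t),\mathcal{L}(\bar X_t)) ={}& p\int_0^1 |F_t^{-1}(u)-\bar F_t^{-1}(u)|^{p-2}\big(F_t^{-1}(u)-\bar F_t^{-1}(u)\big)\big(b(F_t^{-1}(u))-\beta_t(u)\big)\,du \\
&+ \frac{p(p-1)}{2}\int_0^1 |F_t^{-1}(u)-\bar F_t^{-1}(u)|^{p-2}\big(\partial_u F_t^{-1}(u)-\partial_u \bar F_t^{-1}(u)\big)\left(\frac{a(F_t^{-1}(u))}{\partial_u F_t^{-1}(u)}-\frac{\alpha_t(u)}{\partial_u \bar F_t^{-1}(u)}\right)du \\
\le{}& p\int_0^1 |F_t^{-1}(u)-\bar F_t^{-1}(u)|^{p-2}\left[\big(F_t^{-1}(u)-\bar F_t^{-1}(u)\big)\big(b(F_t^{-1}(u))-\beta_t(u)\big) + \frac{(p-1)\big(a(F_t^{-1}(u))-\alpha_t(u)\big)^2}{8\underline{a}}\right]du.
\end{aligned}$$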

Equation (2.4) follows by remarking that

$$\big(a(F_t^{-1}(u))-\alpha_t(u)\big)^2 \le 2\Big(\|a'\|_\infty^2\,|F_t^{-1}(u)-\bar F_t^{-1}(u)|^2 + \big(a(\bar F_t^{-1}(u))-\alpha_t(u)\big)^2\Big)$$

and using a similar idea for $|b(F_t^{-1}(u))-\beta_t(u)|$.

To prove (A.7) for $0 < s < t \le T$, we use the Aronson estimates recalled in the proof of Proposition 2.4 for $X_t$ and deduced from Theorem 2.1 [22] for the Euler scheme:

$$\frac{c}{\sqrt{r}}\exp\left(-\frac{(x-x_0)^2}{cr}\right) \le p_r(x) \wedge \bar p_r(x) \le p_r(x) \vee \bar p_r(x) \le \frac{C}{\sqrt{r}}\exp\left(-\frac{(x-x_0)^2}{Cr}\right). \tag{A.9}$$

Setting $K_1 = \frac{c}{\sqrt{t}}$, $c_1 = cs/2$, $K_2 = \frac{C}{\sqrt{s}}$ and $c_2 = Ct/2$, one has

$$\forall r \in [s,t],\ \forall x \in \mathbb{R},\quad K_1\exp\left(-\frac{(x-x_0)^2}{2c_1}\right) \le \tilde p_r(x) \le K_2\exp\left(-\frac{(x-x_0)^2}{2c_2}\right), \tag{A.10}$$

where $\tilde p_r$ denotes either $p_r$ or $\bar p_r$. The four limits in (A.7) can be obtained similarly, and we focus on the one of $\sup_{r \in [s,t]} \frac{a(F_r^{-1}(u))}{\partial_u F_r^{-1}(u)}\,|F_r^{-1}(u)-\bar F_r^{-1}(u)|^{p-1}$. Up to modifying $K_1 > 0$ and decreasing $c_1 > 0$, we get from (A.10) that

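$$\forall r \in [s,t],\ \forall x \le x_0 - 1,\quad K_1(x_0-x)\exp\left(-\frac{(x-x_0)^2}{2c_1}\right) \le \tilde p_r(x) \le K_2(x_0-x)\exp\left(-\frac{(x-x_0)^2}{2c_2}\right),$$

which leads to

$$\forall x \le x_0 - 1,\quad K_1c_1\exp\left(-\frac{(x-x_0)^2}{2c_1}\right) \le G_r(x) \le K_2c_2\exp\left(-\frac{(x-x_0)^2}{2c_2}\right),$$

where $G_r$ denotes either $F_r$ or $\bar F_r$. Thus, the inverse function satisfies

$$x_0 - \sqrt{-2c_2\log\left(\frac{u}{K_2c_2}\right)} \le G_r^{-1}(u) \le x_0 - \sqrt{-2c_1\log\left(\frac{u}{K_1c_1}\right)} \tag{A.11}$$

for $u$ small enough. The two last inequalities imply that when $x \to -\infty$,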



$$\forall r \in [s,t],\quad \bar F_r^{-1}(F_r(x)) \ge x_0 - \sqrt{-2c_2\left[\log\left(\frac{K_1c_1}{K_2c_2}\right) - \frac{(x-x_0)^2}{2c_1}\right]}$$

and $\sup_{r \in [s,t]} |x - \bar F_r^{-1}(F_r(x))| = O(x)$. With the boundedness of $a$ and (A.10), we easily deduce that

$$\lim_{x \to -\infty}\ \sup_{r \in [s,t]} a(x)p_r(x)\,|x - \bar F_r^{-1}(F_r(x))|^{p-1} = 0.$$

Since, by (A.11), $F_r^{-1}(u)$ converges to $-\infty$ uniformly in $r \in [s,t]$ as $u$ tends to $0$, we conclude that

$$\lim_{u \to 0^+}\ \sup_{r \in [s,t]} \frac{a(F_r^{-1}(u))}{\partial_u F_r^{-1}(u)}\,|F_r^{-1}(u)-\bar F_r^{-1}(u)|^{p-1} = 0. \qquad \square$$

Proof of Lemma 2.7. By Jensen's inequality,

$$\mathbb{E}\big[|\mathbb{E}(W_t - W_{\tau_t}\,|\,\bar X_t)|^p\big] \le \mathbb{E}\big[|W_t - W_{\tau_t}|^p\big] \le \frac{C}{N^{p/2}}.$$

Let us now check that the left-hand-side is also smaller than $\frac{C}{t^{p/2}N^p}$. To do this, we will study

$$\mathbb{E}\big[(W_t - W_{\tau_t})\,g(\bar X_t)\big]$$

where $g$ is any smooth real valued function.

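In order to continue, we need to do various estimations on the Euler scheme and its stochastic derivatives. Let $\eta_t = \min\{t_i : t \le t_i\}$ denote the discretization time just after $t$. We have $D_u \bar X_t = 0$ for $u > t$, and

$$D_u \bar X_t = 1_{\{t \le \eta_u\}}\,\sigma(\bar X_{\tau_t}) + 1_{\{t > \eta_u\}}\big(1 + \sigma'(\bar X_{\tau_t})(W_t - W_{\tau_t}) + b'(\bar X_{\tau_t})(t - \tau_t)\big)D_u \bar X_{\tau_t} \quad \text{for } u \le t.$$

Then by induction, one clearly obtains that for $u \le t$,

$$D_u \bar X_t = \sigma(\bar X_{\tau_u})\,\bar{\mathcal{E}}_{u,t},$$

with

$$\bar{\mathcal{E}}_{u,t} = \begin{cases}
1 & \text{if } \tau_t < \eta_u, \\
1 + b'(\bar X_{\tau_t})(t - \tau_t) + \sigma'(\bar X_{\tau_t})(W_t - W_{\tau_t}) & \text{if } \eta_u = \tau_t, \\
\displaystyle\prod_{i\,:\,\eta_u \le t_i < \tau_t}\big(1 + b'(\bar X_{t_i})(t_{i+1} - t_i) + \sigma'(\bar X_{t_i})(W_{t_{i+1}} - W_{t_i})\big) \times \big(1 + b'(\bar X_{\tau_t})(t - \tau_t) + \sigma'(\bar X_{\tau_t})(W_t - W_{\tau_t})\big) & \text{if } \eta_u < \tau_t.
\end{cases}$$

Note that $\bar{\mathcal{E}}$ satisfies the following properties: 1. $\bar{\mathcal{E}}_{u,t} = \bar{\mathcal{E}}_{\eta(u),t}$ and 2. $\bar{\mathcal{E}}_{t_i,t_j}\bar{\mathcal{E}}_{t_j,t} = \bar{\mathcal{E}}_{t_i,t}$ for $t_i \le t_j \le t$. We also introduce the process $\mathcal{E}$ defined by

$$\mathcal{E}_{u,t} = \exp\left(\int_u^t \Big(b'(X_s) - \frac{1}{2}\sigma'(X_s)^2\Big)ds + \int_u^t \sigma'(X_s)\,dW_s\right).$$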
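
Property 2 of the discrete process is a telescoping product of one-step factors on the grid. The sketch below, given only as an illustration, records these factors along a simulated Euler path; the names `euler_with_factors` and `tangent` and the coefficient callables (`sigma`, `dsigma` for its derivative, `b`, `db` for its derivative) are hypothetical.

```python
import math
import random


def euler_with_factors(x0, sigma, dsigma, b, db, T, n_steps, rng):
    """Euler path on the regular grid together with its one-step tangent factors.

    factors[i] = 1 + b'(X_{t_i}) h + sigma'(X_{t_i}) (W_{t_{i+1}} - W_{t_i}).
    """
    h = T / n_steps
    xs, factors = [x0], []
    for _ in range(n_steps):
        x = xs[-1]
        dw = rng.gauss(0.0, math.sqrt(h))
        factors.append(1.0 + db(x) * h + dsigma(x) * dw)
        xs.append(x + sigma(x) * dw + b(x) * h)
    return xs, factors


def tangent(factors, i, j):
    """Product of the one-step factors over grid steps i, ..., j-1 (1.0 if i == j)."""
    prod = 1.0
    for f in factors[i:j]:
        prod *= f
    return prod
```

With these conventions, `tangent(factors, i, j) * tangent(factors, j, k)` agrees with `tangent(factors, i, k)` up to floating-point rounding, which is the grid version of the multiplicative property.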

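The next lemma, the proof of which is postponed to the end of the present proof, states some useful properties of the processes $\mathcal{E}$ and $\bar{\mathcal{E}}$.

Lemma A.2 Let us assume that $b, \sigma \in C_b^2$. Then, we have:

$$\sup_{0 \le s \le t \le T}\Big(\mathbb{E}\big[\mathcal{E}_{s,t}^{-p}\big] + \mathbb{E}\big[\mathcal{E}_{s,t}^{p}\big]\Big) \le C, \qquad \sup_{0 \le s \le t \le T}\mathbb{E}\big[\bar{\mathcal{E}}_{s,t}^{\,p}\big] \le C, \tag{A.12}$$

$$\sup_{0 \le s,u \le t \le T}\mathbb{E}\big[|D_u \mathcal{E}_{s,t}|^p + |D_u \bar{\mathcal{E}}_{s,t}|^p\big] \le C, \tag{A.13}$$

$$\sup_{0 \le t \le T}\mathbb{E}\big[|\mathcal{E}_{0,t} - \bar{\mathcal{E}}_{0,t}|^p\big] \le \frac{C}{N^{p/2}}, \tag{A.14}$$

where $C$ is a positive constant depending only on $p$ and $T$.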
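We next define the localization given by

$$\psi = \varphi\big(\mathcal{E}_{0,t}^{-1}(\mathcal{E}_{0,t} - \bar{\mathcal{E}}_{0,t})\big).$$

Here $\varphi : \mathbb{R} \to [0,1]$ is a $C^\infty$ symmetric function so that

$$\varphi(x) = \begin{cases} 0, & \text{if } |x| > \frac{1}{2}, \\ 1, & \text{if } |x| < \frac{1}{4}. \end{cases}$$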



One has 

E [(W t - W Tt )g(X t )] = E [(W t - W n )g(X t )^] + E [(W t - W Tt ) 5 (X t )(l 
= / E[^(X t )D tt X t ]d« + E / D u ipdu 

Jrt Y J r t . 



+ E[(^ ( -^) S (X t )(l- 



where the second equality follows from the duality formula (see e.g. Definition 1.3.1 in [23]). 
Since for Tt < u < t 



E [H{X t )D u X t ] = E [ip 9 '(X t )a(X n )] = t^E \ f D s g{X t ) 

Uo 



D s X t 



t _1 E 



one deduces 



g(X t ) MXr^iX^S^dWs 
Jo 



l SW, 



+ E 









1 !\ 


[/' 


<M 


J n 


.Jo 




/ D u ipdu 


x t 









X, 



du 



+ E[(W t -W Tt )(l-i>)\X t ]. (A.15) 



Here $\delta W$ denotes the Skorohod integral. In order to obtain the conclusion of the Lemma, we need to bound the $L^p$-norm of each term on the right-hand-side of (A.15). In particular, we will use the following estimate (which also proves the existence of the Skorohod integral on the left side below), which can be found in Proposition 1.5.4 in [23]:

fw^)"" 1 {X Ts )£;jSW s < C{p) \\^a(X Tt )a- 1 {X T ) £~ t \ p , (A.16) 

JO z> 

By Jensen's inequality for p > 2, 



where \\F.f ljt = E [(/o*^ 2 ^)^ + (/„' J^D u F s ) 2 dsdu) 
we have 

\\F.\\ p lp < i p/2_1 ['EWF^ds + tP- 2 f f ' E[\D u F s \P]dsdu, (A.17) 
jo Jo Jo 

and we will use this inequality to upper bound (A.16). When 1 < p < 2, we will use alternatively 



the following upper bound || F. \ \\ p < E[F 2 ]ds) P + (jj J* E[(D u F s ) 2 ]dsduY that comes 
from Holder's inequality. 

For tp > 0, £oj > \ £ 0,t > 0. From Hypothesis 3.1, there are constants < a < a < oo such that 
< a < a < a, and one has 



jf E [(^a(^)a- 1 (* T .) ds < J* E [v>%7^ w 



(is 



< 



I)' 



Eft 



o,t 



E[|4„ (fl) | 2 P]cfc <Ci, 
by using the estimates (A.12).

Next, we focus on getting an upper bound for 



f f\\\D u {MXrM' 1 {X Ts ) £- t 
Jo Jo L 



-1\ IP 



dsdu. 



(A.18) 



To do so, we compute the derivative using basic derivation rules, which gives 

D u ^a{X Tt )a~ l (X Ts ) £~l) = D^X^c' 1 (X Ts ) £~} + ^a' {X Tt )D u X Tt <j- 1 (X Ts ) £~l 

- ^a{X Tt )a- 2 a' (X Ts ) a(X Tu )£ u ^£;}l u < Ts 

- MXrJa' 1 {X Ts ) £-?D u £ s , t . (A.19) 

One has then to get an upper bound for the L p -norm of each term. As many of the arguments 
are repetitive, we show the reader only some of the arguments that are involved. Let us start 
with the first term. We have 

D u tP = if' (£ ~l (£ ,t - £o,t)) D u £~l (£ ,t - £ 0;t ) 
and D u £ Q ^ (£ ,t - £o,t) = £^t D u£o,t£o,t-£o~J D u £ Qjt . From the estimates in (A.12) and (A.13), 



we obtain 



sup HAjV'IL ^ W\\ooC(p). 
ue[o,t] 



(A.20) 



Since £ S J = £ ^ {s) £ J and £ ,t > ±£ ,t > if (p' (s j (£ ,t - £o,tj) ¥= 0, we have 



E 



iD^aiX^a' 1 (X Ts ) £ 



-l \p 



s,t 



< (^J\\D u ^\f 2p E 



£o,t £q,v(s) 



2p 



1/2 



Similar bounds hold for the three other terms. Note that the highest requirements on the 
derivatives of b and a will come from the terms involving D u £ in (A.19). Gathering all the 
upper bounds, we get that ^{X^a' 1 (X T .) £~t\\ P lp < C(fl 2 + t p ) < Ct p / 2 since < t < T. 
From (A. 16), we finally obtain 



f ^a(X Tii )a- l {X Ts )£;}5W s 
Jo 



We are now in position to conclude. Using Jensen's inequality, the results (A. 15), (A.12), (A. 14), 
(A.20) and the definition of tp together with Chebyshev's inequality, we have for any k > that 



E[\E[W t -W Tt \X t ]\ p ] 

<C\t-P(t-T t )P 



I* ^{X^a-^X^E-lSW, 
Jo 

+Vn\Wt-W Tt \ 2 P)4 k / 2 (E(\£ , t - £oA 2k m£o?)) 

<C^ 2 {t-T t f+(t-T t Y+{^ 



1/4 



1 f \\DuH p p 

Jr t 



du 



\ (2p+fc)/4 N 



1 1 

+ 



tP/ 2 NP TV 



£ I * 
2 T 4 
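Proof of Lemma A.2. The upper bounds (A.12) on $\mathcal{E}$ and $\bar{\mathcal{E}}$ are obvious since $b'$ and $\sigma'$ are bounded. Now, let us remark that $\mathcal{E}$ and $\bar{\mathcal{E}}$ satisfy

$$\mathcal{E}_{u,t} = 1 + \int_u^t \sigma'(X_s)\mathcal{E}_{u,s}\,dW_s + \int_u^t b'(X_s)\mathcal{E}_{u,s}\,ds,$$

$$\bar{\mathcal{E}}_{\eta_u,t} = 1 + \int_{\eta_u}^t \sigma'(\bar X_{\tau_s})\bar{\mathcal{E}}_{\eta_u,\tau_s}\,dW_s + \int_{\eta_u}^t b'(\bar X_{\tau_s})\bar{\mathcal{E}}_{\eta_u,\tau_s}\,ds.$$

Thus, (A.14) can be easily obtained by noticing that $(\bar X_t,\bar{\mathcal{E}}_{0,t})$ is the Euler scheme for the SDE $(X_t,\mathcal{E}_{0,t})$ which has Lipschitz coefficients, and by using the strong convergence order of $1/2$ (see e.g. [17]).

The estimate (A.13) on $D_u\mathcal{E}$ is given, for example, by Theorem 2.2.1 in [23]. On the other hand, we have for $\eta(s) \le u \le t$

$$\begin{aligned}
D_u \bar{\mathcal{E}}_{\eta_s,t} ={}& \sigma'(\bar X_{\tau_u})\bar{\mathcal{E}}_{\eta_s,\tau_u} + \int_{\eta_u}^t \big[\sigma''(\bar X_{\tau_r})\sigma(\bar X_{\tau_u})\bar{\mathcal{E}}_{\eta_u,\tau_r}\bar{\mathcal{E}}_{\eta_s,\tau_r} + \sigma'(\bar X_{\tau_r})D_u\bar{\mathcal{E}}_{\eta_s,\tau_r}\big]\,dW_r \\
&+ \int_{\eta_u}^t \big[b''(\bar X_{\tau_r})\sigma(\bar X_{\tau_u})\bar{\mathcal{E}}_{\eta_u,\tau_r}\bar{\mathcal{E}}_{\eta_s,\tau_r} + b'(\bar X_{\tau_r})D_u\bar{\mathcal{E}}_{\eta_s,\tau_r}\big]\,dr.
\end{aligned}$$

In order to obtain an $L^p(\Omega)$ estimate, we then use (A.12), $b, \sigma \in C_b^2$ and Gronwall's lemma. $\square$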

B Proofs of Section 3 



Proof of Proposition 3.4. We use the dual representation of the Wasserstein distance (0.6) deduced from the Kantorovich duality theorem (see for instance Theorem 5.10 p58 [29]):

$$W_p^p(\mu,\nu) = \sup_{\phi \in L^1(\nu)}\left(\int_E \tilde\phi(x)\,\mu(dx) - \int_E \phi(x)\,\nu(dx)\right)$$

where $\tilde\phi(x) = \inf_{y \in E}\{\phi(y) + |y - x|^p\}$.

We also denote by (X^' x ) t ^ s the solution to (0.1) starting from x G M. at time s G [0,T] and 

— fx 

by (A t J ' )te[tj,T] the Euler scheme starting from x at time tj with j G {0, . . . , N}. It is enough 
to check that 

Wk = W p (C(X Sl , . . . , X Sk ,X Sk+1 h ,...,X Sn k ),C(X Sl , . . . , X Sk _ 1 ,X Sk k 1 ,...,X Sn k 1 )) 

is smaller than C sup < t < T;Eg]R W P {£{X? ), C{Xf)) since W p {C{X Sl , . . . , X„ n ), C{X Sl ,...,X a J)< 
Ylk=i w k- For / : IR n — ^ M a bounded measurable function and 

f(xi,...,x n )= inf {f(yi,...,y n )+ max \yj -Xj\ p }, 
(yi,...,y n )m n l<'J<n 

we set f k (x u ...,x k )= E(f(x u ...,x k , X%£ k ,. . . , X^ Xk )). First choosing 

{vi,...,Vk-i, vk+i , ■ ■ ■ , vn) = (x 81 x Sk _ ± , , . . . , x%y* ) , 
then conditioning on $\sigma(W_s, s \le s_k)$ and using (1.3), next conditioning on $\sigma(W_s, s \le s_{k-1})$ and using the dual formulation of the Wasserstein distance, one gets

^ I Jl A si, • • • ,^s k ,^s k+1 ,...,A Sn ) — /(A Sl , . . . , A Sfe _ 1 , A Sfc ,...,A S „ J I 

<E( inf {/(X Sl ,...,X St 1 ,y fc ,X, Sfc '^,...,X s Sfc '^)+ max IX^'^ - X s s k,Xsk \ p ] 

/•/V V y S H' X »H Sfc-lA^j, \ 

- Jl^si, • • • , A s fc _i, -^s k ,...,A S „ J I 

< E ( inf {A(X S1 , . . . , X^,^) + - Xjf} - / fc (X Sl , . . .,X Sk _ 1 ,X a s k k - 1 ' X ' k - 1 )) 

< CE^^^- 1 ^),/:^- 1 ^))!^ ) < CsnpW^CiX^J^iX^J) 
<C sup W£(£(a?),£(A7)). 

0<t<7>e]R 



C Some properties of diffusion bridges 

Let us suppose that the SDE $dX_t = b(X_t)dt + \sigma(X_t)dW_t$, $X_0 = x$ has a transition density $p_t(x,y)$ which is positive and of class $C^{1,2}$ with respect to $(t,x) \in \mathbb{R}_+^* \times \mathbb{R}$. We check later in this section that this holds under Hypothesis 3.1. Then, the law of the diffusion bridge with time horizon $T$ is given by (see for instance Fitzsimmons, Pitman and Yor [5])

$$\mathbb{E}[F(X_u, 0 \le u \le t)\,|\,X_T = y] = \mathbb{E}\left[F(X_u, 0 \le u \le t)\,\frac{p_{T-t}(X_t,y)}{p_T(x,y)}\right], \quad 0 \le t < T,$$

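where $F : C([0,t],\mathbb{R}) \to \mathbb{R}$ is a bounded measurable function. Indeed, for $g : \mathbb{R} \to \mathbb{R}$ measurable and bounded, using that $X_T$ has the density $p_T(x,y)$ then the Markov property at time $t$, one checks that

$$\begin{aligned}
\mathbb{E}\left[\mathbb{E}\left[F(X_u, 0 \le u \le t)\,\frac{p_{T-t}(X_t,y)}{p_T(x,y)}\right]\bigg|_{y = X_T} g(X_T)\right] &= \mathbb{E}\left[F(X_u, 0 \le u \le t)\int_{\mathbb{R}} g(y)\,p_{T-t}(X_t,y)\,dy\right] \\
&= \mathbb{E}\big[F(X_u, 0 \le u \le t)\,\mathbb{E}[g(X_T)\,|\,X_t]\big] = \mathbb{E}\big[F(X_u, 0 \le u \le t)\,g(X_T)\big].
\end{aligned}$$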

We thus focus on the change of probability measure

$$\frac{d\mathbb{P}^y}{d\mathbb{P}}\bigg|_{\mathcal{F}_t} = \frac{p_{T-t}(X_t,y)}{p_T(x,y)} =: M_t,$$

so that $\mathbb{E}[F(X_u, 0 \le u \le t)\,|\,X_T = y] = \mathbb{E}^y[F(X_u, 0 \le u \le t)]$, where $\mathbb{E}^y$ denotes the expectation with respect to $\mathbb{P}^y$. We define $\ell_t(x,y) = \log p_t(x,y)$. The process $(M_t)_{t \in [0,T)}$ is a martingale, and by Itô's formula, we get $dM_t = M_t\,\partial_x\ell_{T-t}(X_t,y)\,\sigma(X_t)\,dW_t$, which gives

$$M_t = \exp\left(\int_0^t \partial_x\ell_{T-s}(X_s,y)\,\sigma(X_s)\,dW_s - \frac{1}{2}\int_0^t \partial_x\ell_{T-s}(X_s,y)^2\,\sigma(X_s)^2\,ds\right).$$

Girsanov's theorem then gives that for all $y \in \mathbb{R}$, $\big(W_t^y = W_t - \int_0^t \partial_x\ell_{T-s}(X_s,y)\,\sigma(X_s)\,ds\big)_{t \in [0,T)}$ is a Brownian motion under $\mathbb{P}^y$, so that $(W_t^{X_T})_{t \in [0,T)}$ is a Brownian motion independent of $X_T$. Moreover, we have

$$dX_t = \big[b(X_t) + \partial_x\ell_{T-t}(X_t,y)\,\sigma(X_t)^2\big]dt + \sigma(X_t)\,dW_t^y, \tag{C.1}$$

which gives precisely the diffusion bridge dynamics.
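
To make the bridge dynamics concrete, consider the simplest case of a standard Brownian motion (zero drift, unit diffusion coefficient), which is an illustrative assumption and not the general setting of this appendix. Then $p_t(x,y)$ is the Gaussian kernel, $\partial_x \log p_{T-t}(x,y) = \frac{y-x}{T-t}$, and the bridge equation reduces to the classical Brownian bridge SDE $dX_t = \frac{y - X_t}{T-t}dt + dW_t^y$. The sketch below discretizes it with an Euler scheme; the function name `brownian_bridge_euler` is hypothetical.

```python
import math
import random


def brownian_bridge_euler(x, y, T, n_steps, rng):
    """Euler scheme for dX = (y - X)/(T - t) dt + dW on the grid t_k = k*T/n_steps.

    Assumption: Brownian case, so the extra drift d_x log p_{T-t}(x, y)
    equals (y - x)/(T - t).
    """
    h = T / n_steps
    path = [x]
    for k in range(n_steps):
        t_k = k * h
        drift = (y - path[-1]) / (T - t_k)  # bridge drift pulling towards y
        path.append(path[-1] + drift * h + rng.gauss(0.0, math.sqrt(h)))
    return path
```

On the last step the drift term moves the path essentially onto $y$, so the simulated terminal value differs from $y$ only by the final Gaussian increment, of standard deviation $\sqrt{T/n}$ for $n$ steps.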



Conversely, we would like now to reconstruct the diffusion from the initial and the final value 
by using diffusion bridges. We have the following result. 

Proposition C.1 We consider an SDE $dX_t = b(X_t)dt + \sigma(X_t)dW_t$, $X_0 = x$ with a transition density $p_t(x,y)$ positive and of class $C^{1,2}$ on $(t,x) \in \mathbb{R}_+^* \times \mathbb{R}$. Let $(B_t, t \ge 0)$ be a standard Brownian motion and $Z_T$ be a random variable with density $y \mapsto p_T(x,y)$ drawn independently from $B$. We assume that pathwise uniqueness holds for the SDE

$$dZ_t^{x,y} = \big[b(Z_t^{x,y}) + \partial_x\ell_{T-t}(Z_t^{x,y},y)\,\sigma(Z_t^{x,y})^2\big]dt + \sigma(Z_t^{x,y})\,dB_t, \quad Z_0^{x,y} = x, \quad t \in [0,T), \tag{C.2}$$

for any $x,y \in \mathbb{R}$, and set $Z_t = Z_t^{x,Z_T}$ for $t \in [0,T)$. Then, $(Z_t)_{t \in [0,T]}$ and $(X_t)_{t \in [0,T]}$ have the same law.

A consequence of this result is that $(Z_t, t \in [0,T])$ has continuous paths, which gives that $\lim_{t \to T^-} Z_t^{x,y} = y$ a.s., $dy$-a.e.

Proof. Let $t \in [0,T)$ and $F : C([0,t],\mathbb{R}) \to \mathbb{R}$ and $g : \mathbb{R} \to \mathbb{R}$ be bounded and measurable functions. Since pathwise uniqueness for the SDE (C.2) implies weak uniqueness, we get

$$\mathbb{E}\big[F(Z_u^{x,y}, 0 \le u \le t)\big] = \mathbb{E}^y\big[F(X_u, 0 \le u \le t)\big] = \mathbb{E}\left[F(X_u, 0 \le u \le t)\,\frac{p_{T-t}(X_t,y)}{p_T(x,y)}\right].$$

Thus, we have

$$\mathbb{E}\big[F(Z_u, 0 \le u \le t)\,g(Z_T)\big] = \mathbb{E}\left[F(X_u, 0 \le u \le t)\int_{\mathbb{R}} p_{T-t}(X_t,y)\,g(y)\,dy\right] = \mathbb{E}\big[F(X_u, 0 \le u \le t)\,g(X_T)\big].$$

Hence the finite-dimensional marginals of the two processes are equal. Since $(X_t)_{t \in [0,T]}$ has continuous paths and $(Z_t)_{t \in [0,T]}$ has càdlàg paths (continuous on $[0,T)$ with a possible jump at $T$), this concludes the proof. $\square$

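From now on, we assume that Hypothesis 3.1 holds. We introduce the Lamperti transformation of the stochastic process $(X_t, t \ge 0)$. We define $\varphi(x) = \int_0^x \frac{dy}{\sigma(y)}$ and $\alpha(y) = \left(\frac{b}{\sigma} - \frac{\sigma'}{2}\right) \circ \varphi^{-1}(y)$, $\hat X_t = \varphi(X_t)$ so that we have

$$d\hat X_t = \alpha(\hat X_t)\,dt + dW_t, \quad t \in [0,T]. \tag{C.3}$$

By Hypothesis 3.1, $\varphi$ is a $C^5$ bijection, $\alpha$ is bounded with bounded derivatives, and both $\varphi$ and $\varphi^{-1}$ are Lipschitz continuous. We denote by $\hat p_t(x,y)$ the transition density of $\hat X$ and $\hat\ell_t(x,y) = \log(\hat p_t(x,y))$.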
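
The Lamperti transformation is easy to evaluate numerically. The sketch below is an illustration only: it assumes the usual form $\varphi(x) = \int_0^x dy/\sigma(y)$ with drift $b/\sigma - \sigma'/2$ for the transformed process, approximates the integral with the composite Simpson rule, and uses the hypothetical names `lamperti_phi` and `lamperti_drift`.

```python
import math


def lamperti_phi(x, sigma, n_intervals=1000):
    """Approximate phi(x) = int_0^x dy / sigma(y) with the composite Simpson rule."""
    if n_intervals % 2:
        n_intervals += 1  # Simpson's rule needs an even number of sub-intervals
    h = x / n_intervals  # negative when x < 0, which orients the integral correctly
    total = 1.0 / sigma(0.0) + 1.0 / sigma(x)
    for i in range(1, n_intervals):
        weight = 4.0 if i % 2 else 2.0
        total += weight / sigma(i * h)
    return total * h / 3.0


def lamperti_drift(x, b, sigma, dsigma):
    """Drift of the transformed process at the point phi(x): b/sigma - sigma'/2 at x."""
    return b(x) / sigma(x) - 0.5 * dsigma(x)
```

For a constant diffusion coefficient the transformation is a simple rescaling, $\varphi(x) = x/\sigma$, which gives a quick consistency check of the quadrature.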

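Lemma C.2 The density $\hat p_t(x,y)$ is $C^{1,2}$ with respect to $(t,x) \in \mathbb{R}_+^* \times \mathbb{R}$. Besides, we have

$$\partial_x\hat\ell_t(x,y) = \frac{y-x}{t} - \alpha(x) + g_t(x,y),$$

where $g_t(x,y)$ is a continuous function on $\mathbb{R}_+ \times \mathbb{R}^2$ such that $\partial_x g_t(x,y)$ and $\partial_y g_t(x,y)$ exist and

$$\forall T > 0, \quad \sup_{t \in [0,T],\ x,y \in \mathbb{R}} |\partial_x g_t(x,y)| + |\partial_y g_t(x,y)| < \infty.$$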
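Proof. It is well-known that we can express the transition density $\hat p_t(x,y)$ by using Girsanov's theorem as an expectation on a Brownian bridge between $x$ and $y$. Namely, since $\alpha$ and its derivatives are bounded, we can apply a result stated in Gihman and Skorohod [7] (Theorem 1, Chapter 3, § 13) to get that $\hat p_t(x,y)$ is positive and

$$\hat\ell_t(x,y) = -\frac{(x-y)^2}{2t} + \int_x^y \alpha(z)\,dz + \log\mathbb{E}\left[e^{-\frac{1}{2}\int_0^t (\alpha'+\alpha^2)\left(x + W_s + \frac{s}{t}(y-x-W_t)\right)ds}\right] - \frac{1}{2}\log(2\pi t).$$

Clearly, $\hat\ell_t(x,y)$ is $C^{1,2}$ in $(t,x) \in \mathbb{R}_+^* \times \mathbb{R}$ (the dominated convergence theorem applies directly to the third term since $\alpha$ and its derivatives are bounded), and we have

$$g_t(x,y) = -\frac{\mathbb{E}\left[e^{-\frac{1}{2}\int_0^t (\alpha'+\alpha^2)\left(x + W_s + \frac{s}{t}(y-x-W_t)\right)ds}\int_0^t \frac{t-s}{2t}\,(\alpha''+2\alpha\alpha')\left(x + W_s + \frac{s}{t}(y-x-W_t)\right)ds\right]}{\mathbb{E}\left[e^{-\frac{1}{2}\int_0^t (\alpha'+\alpha^2)\left(x + W_s + \frac{s}{t}(y-x-W_t)\right)ds}\right]}.$$

This is a continuous function on $\mathbb{R}_+ \times \mathbb{R}^2$, and we easily conclude by using the dominated convergence theorem and the boundedness of the derivatives of $\alpha$. $\square$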

By straightforward calculations, we have

$$p_t(x,y) = \frac{1}{\sigma(y)}\,\hat p_t(\varphi(x),\varphi(y)),$$

and $p_t(x,y)$ is thus positive and $C^{1,2}$ with respect to $(t,x)$. The diffusion bridge (C.1) is thus well defined. Since $\partial_x\ell_t(x,y) = \frac{1}{\sigma(x)}\,\partial_x\hat\ell_t(\varphi(x),\varphi(y))$, we get by Itô's formula from (C.1)

$$d\hat X_t = \big[\alpha(\hat X_t) + \partial_x\hat\ell_{T-t}(\hat X_t,\varphi(y))\big]dt + dW_t^y, \qquad dW_t^y = dW_t - \partial_x\hat\ell_{T-t}(\hat X_t,\varphi(y))\,dt.$$

Therefore, as one could expect, the Lamperti transform of the diffusion bridge coincides with the diffusion bridge of the Lamperti transform.

Proposition C.3 Let Hypothesis 3.1 hold. There exists a deterministic constant C such that 

VTG (0,7], x,x',y,y'eR, sup \Z*' y - zf y | < C(\x - x'\ V \y - y'\), 

te[o,T) 

and in particular, pathwise uniqueness holds for (C.2). 




E 



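Proof. For $\hat x,\hat y \in \mathbb{R}$, we consider the following SDE

$$d\hat Z_t^{\hat x,\hat y} = dB_t + \left[\frac{\hat y - \hat Z_t^{\hat x,\hat y}}{T-t} + g_{T-t}(\hat Z_t^{\hat x,\hat y},\hat y)\right]dt, \quad \hat Z_0^{\hat x,\hat y} = \hat x, \quad t \in [0,T), \tag{C.4}$$

that corresponds to the diffusion bridge on the Lamperti transform $\hat X$. We set $\Delta_t = \hat Z_t^{\hat x,\hat y} - \hat Z_t^{\hat x',\hat y'}$ for $t \in [0,T)$ and $\hat x',\hat y' \in \mathbb{R}$. We have

$$d\Delta_t = \left[\frac{\hat y - \hat y' - \Delta_t}{T-t} + g_{T-t}(\hat Z_t^{\hat x,\hat y},\hat y) - g_{T-t}(\hat Z_t^{\hat x',\hat y'},\hat y')\right]dt,$$

and thus $d(|\Delta_t| \vee |\hat y - \hat y'|) = \mathrm{sign}(\Delta_t)\,1_{\{|\Delta_t| > |\hat y - \hat y'|\}}\,d\Delta_t$. On the one hand, we observe that $1_{\{|\Delta_t| > |\hat y - \hat y'|\}}\big[\mathrm{sign}(\Delta_t)(\hat y - \hat y') - |\Delta_t|\big] \le 0$. On the other hand, $g_t$ is uniformly Lipschitz w.r.t. $(x,y)$ on $t \in [0,T]$ by Lemma C.2, which leads to:

$$d(|\Delta_t| \vee |\hat y - \hat y'|) \le C\,(|\Delta_t| \vee |\hat y - \hat y'|)\,dt,$$

for some positive constant $C$. Gronwall's lemma then gives $|\Delta_t| \le e^{CT}(|\hat x - \hat x'| \vee |\hat y - \hat y'|)$. This gives in particular pathwise uniqueness for (C.4).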

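Now, let us assume that $(Z_t^{x,y})_{t \in [0,T)}$ solves (C.2). Then, $\varphi(Z_t^{x,y})$ solves (C.4) with $\hat x = \varphi(x)$ and $\hat y = \varphi(y)$, and we necessarily have $Z_t^{x,y} = \varphi^{-1}\big(\hat Z_t^{\varphi(x),\varphi(y)}\big)$ by pathwise uniqueness. Both $\varphi$ and $\varphi^{-1}$ are Lipschitz, and we denote by $K$ a common Lipschitz constant. Then, we get

$$|Z_t^{x,y} - Z_t^{x',y'}| = \big|\varphi^{-1}\big(\hat Z_t^{\varphi(x),\varphi(y)}\big) - \varphi^{-1}\big(\hat Z_t^{\varphi(x'),\varphi(y')}\big)\big| \le K^2 e^{CT}(|x - x'| \vee |y - y'|),$$

which gives the desired result. $\square$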